Why does 'sum' work but '+/' not work in KDB queries?

Why does sum work here, but the underlying form +/ not work? (Taken from https://code.kx.com/q/ref/sum/)
t: ([]name:`Jack`Jill`Janet;hair:`brown`black`fair;eye:`blue`green`hazel;age:12 9 14)
q)select sum age from t
age
---
35
q)select +/age from t
'/
[0] select +/age from t
^

This is because the bare form +/ is k syntax. To apply it (and similar k constructs) as a unary function in q, you need to wrap it in parentheses:
select enlist (+/)age from t
In general, if a built-in q keyword exists for a k expression, you should use the keyword (sum in this case), as it likely carries further optimisations.
Inside a select statement, q automatically enlists the result of sum, which (+/) does not do; that is why I enlisted it manually above. Without the enlist, expect a 'rank error.

Related

How can one drop/delete columns from a KDB table in place?

Following the documentation, I tried to do the following:
t:([]a:1 2 3;b:4 5 6;c:`d`e`f) // some input table
`a`b _ t // works: delete NOT in place
(enlist `a) _ t // works: delete NOT in place
t _:`a`b // drop columns in place does not work; how to make it to work?
// 'type
// [0] t _:`a`b
Thank you very much for your help!
You should be able to use
delete a,b from `t
to delete in place (passing the table name as a symbol, `t, makes the change in place).
Alternatively, for more flexibility, you could use the functional form:
![`t;();0b;`a`b]
The simplest way to achieve column deletion in place is using qSQL:
t:([]a:1 2 3;b:4 5 6;c:`d`e`f)
delete a,b from `t / the backtick before t makes the change in place
q)t
c
-
d
e
f
Michael & Kyle have covered the q-SQL options; for completeness, here are a couple of other options using _:
Using _ as in your question, you can re-assign this back to t e.g.
t:`a`b _ t
You can also use . amend with an empty list of indexes i.e. "amend entire", which can be done in-place by passing `t or not in-place by passing just t e.g.
q).[t;();`a`b _] / not in-place
c
-
d
e
f
q).[`t;();`a`b _] / in-place
`t
q)t
c
-
d
e
f

How can I convert this select statement to functional form?

I am having a couple of issues putting this into functional form.
select from tableName where i=fby[(last;i);([]column_one;column_two)]
This is what I got:
?[tableName;fby;enlist(=;`i;(enlist;last;`i);(+:;(!;enlist`column_one`column_two;(enlist;`column_one;`column_two))));0b;()]
but I get a type error.
Any suggestions?
Consider using the following function, adapted from the buildQuery function given in the whitepaper on parse trees. It is a useful tool for quick development in q. This version improves on the one in the whitepaper by also handling updates by reference (e.g. update x:3 from `tab).
\c 30 200
tidy:{ssr/[;("\"~~";"~~\"");("";"")] $[","=first x;1_x;x]};
strBrk:{y,(";" sv x),z};
//replace k representation with equivalent q keyword
kreplace:{[x] $[`=qval:.q?x;x;"~~",string[qval],"~~"]};
funcK:{$[0=t:type x;.z.s each x;t<100h;x;kreplace x]};
//replace eg ,`FD`ABC`DEF with "enlist`FD`ABC`DEF"
ereplace:{"~~enlist",(.Q.s1 first x),"~~"};
ereptest:{((0=type x) & (1=count x) & (11=type first x)) | ((11=type x)&(1=count x))};
funcEn:{$[ereptest x;ereplace x;0=type x;.z.s each x;x]};
basic:{tidy .Q.s1 funcK funcEn x};
addbraks:{"(",x,")"};
//where clause needs to be a list of where clauses, so if only one whereclause need to enlist.
stringify:{$[(0=type x) & 1=count x;"enlist ";""],basic x};
//if a dictionary apply to both, keys and values
ab:{$[(0=count x) | -1=type x;.Q.s1 x;99=type x;(addbraks stringify key x),"!",stringify value x;stringify x]};
inner:{[x]
  idxs:2 3 4 5 6 inter ainds:til count x;
  x:#[x;idxs;'[ab;eval]];
  if[6 in idxs;x[6]:ssr/[;("hopen";"hclose");("iasc";"idesc")] x[6]];
  //for select statements within select statements
  //This line has been adjusted
  x[1]:$[-11=type x 1;x 1;$[11h=type x 1;[idxs,:1;"`",string first x 1];[idxs,:1;.z.s x 1]]];
  x:#[x;ainds except idxs;string];
  x[0],strBrk[1_x;"[";"]"]
  };
buildSelect:{[x]
  inner parse x
  };
We can use this to create the functional query that will work
q)n:1000
q)tab:([]sym:n?`3;col1:n?100.0;col2:n?10.0)
q)buildSelect "select from tab where i=fby[(last;i);([]col1;col2)]"
"?[tab;enlist (=;`i;(fby;(enlist;last;`i);(flip;(lsq;enlist`col1`col2;(enlist;`col1;`col2)))));0b;()]"
So we have the following as the functional form
?[tab;enlist (=;`i;(fby;(enlist;last;`i);(flip;(lsq;enlist`col1`col2;(enlist;`col1;`col2)))));0b;()]
// Applying this
q)?[tab;enlist (=;`i;(fby;(enlist;last;`i);(flip;(lsq;enlist`col1`col2;(enlist;`col1;`col2)))));0b;()]
sym col1 col2
----------------------
bah 18.70281 3.927524
jjb 35.95293 5.170911
ihm 48.09078 5.159796
...
Glad you were able to fix your problem with converting your query to functional form.
Generally, when you use parse on a statement containing fby, q expands the function into its k definition. You should usually be able to replace that k code with the q function itself (i.e. change (k){stuff} to fby), and the query will then run properly in functional form.
Additionally, https://code.kx.com/v2/wp/parse-trees/ goes into more detail about parse trees and functional form. It also contains a script called buildQuery, which returns the functional form of a query as a string; this can be handy and save time when the functional form is complex.
I actually got it myself:
?[tableName;((=;`i;(fby;(enlist;last;`i);(+:;(!;enlist`column_one`column_two;(enlist;`column_one;`column_two)))));(in;`venue;enlist`venueone`venuetwo));0b;()]
The issue was a missing pair of parentheses in the statement. It works fine now.
If someone wants to add a more detailed explanation of how parse trees are built manually, and how the generic (k){} function can be replaced with the actual q function, feel free to add an answer and I'll accept and upvote it.

kdb apply function in select by row

I have a table
t: flip `S`V ! ((`$"|A|B|"; `$"|B|C|D|"; `$"|B|"); 1 2 3)
and some dicts
t1: 4 10 15 20 ! 1 2 3 5;
t2: 4 10 15 20 ! 0.5 2 4 5;
Now I need to add a column whose values depend on the substrings in S, using the function below (which is partly pseudocode because I am stuck here).
f:{[s;v];
if[`A in "|" vs string s; t:t1;];
else if[`B in "|" vs string s; t:t2;];
k: asc key t;
:t k k binr v;
}
The problem is that s and v are passed in as full column vectors when I do something like
update l:f[S,V] from t;
How can I make this an operation that works by row?
How can I make this a vectorized function?
Thanks
You will want to use the each-both adverb to apply a function over two columns by row.
In your case:
update l:f'[S;V] from t;
To help with your pseudocode function, you might want to use $, the conditional (if-else) operator, e.g.
f:{[s;v]
  t:$[`A in ls:`$"|"vs string s;t1;`B in ls;t2;()!()];
  k:asc key t;
  :t k k binr v;
  };
You've not mentioned a final else clause in your pseudocode, but $ expects one, hence the empty dictionary at the end.
Also note that the column S in your table is of symbol type. vs expects a string to split, so I've had to use the string operation, and I cast the pieces back to symbols so that they can be compared with `A and `B. The string call could be removed if you are able to redefine your original table.
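For readers who want to check the lookup logic outside q, here is a rough Python sketch of the same row-wise idea. It is only an illustration under stated assumptions: the helper name lookup is hypothetical, bisect_left stands in for binr (both give the index of the first key greater than or equal to v), and None stands in for a q null.

```python
from bisect import bisect_left

# Same lookup tables as t1 and t2 in the question.
t1 = {4: 1, 10: 2, 15: 3, 20: 5}
t2 = {4: 0.5, 10: 2, 15: 4, 20: 5}

def lookup(s, v):
    """Hypothetical row-wise helper: choose a table from the tags in s, then look v up."""
    tags = s.split("|")
    # Mirrors the $[...] conditional: A first, then B, else an empty table.
    table = t1 if "A" in tags else t2 if "B" in tags else {}
    keys = sorted(table)
    i = bisect_left(keys, v)  # index of the first key >= v, like binr
    # Past the last key (or with an empty table) there is nothing to return.
    return table[keys[i]] if i < len(keys) else None

rows = [("|A|B|", 1), ("|B|C|D|", 2), ("|B|", 3)]
print([lookup(s, v) for s, v in rows])
```

Calling lookup once per (S, V) pair plays the role of f'[S;V] in the update statement.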
Hope this helps!

Quicksort in Q/KDB+

I found this quicksort implementation on a website:
q:{$[2>distinct x;x;raze q each x where each not scan x < rand x]};
I don't understand this part:
raze q each x where each not scan x < rand x
Can someone explain it to me step by step?
Let's go through it step by step. I assume you have a basic understanding of the quicksort algorithm. There is also one correction to the code you posted, which I have made in step 5.
Example list:
q)x: 1 0 5 4 3
1. Take a random element from the list to act as the pivot.
q) rand x
Suppose it gives us 4.
2. Split list x into two lists: one containing the elements less than 4 and the other containing the elements greater than or equal to 4.
2.a) First compare all elements with the pivot (4 in our case):
q) (x<rand x) / 11001b : output is boolean list
2.b) Using the boolean list above, we can get all elements of x that are less than 4:
q) x where 11001b / ( 1 0 3) : output
We still need another expression to get all elements greater than or equal to the pivot 4. There are many ways to do it, but let's look at the one used in the code:
q)not scan (x<rand x) / (11001b;00110b) : output
This gives a list containing two lists. The first is the result of (x<rand x), which selects the elements less than the pivot 4. The second is its negation, produced by not, which selects the elements greater than or equal to the pivot 4. scan stops there because applying not a second time returns the initial value.
2.c) Now we can generate the two lists using the sample code from (2.b):
q) x where each (not scan (x<rand x)) / ((1 0 3);(5 4)): output list which has 2 lists
3. Now apply the same function to each list to sort it, i.e. make a recursive call on each item of ((1 0 3);(5 4)):
q) q each x where each (not scan (x<rand x))
4. After all the recursive calls return, apply raze to flatten the returned lists into one single list.
5. The end condition for the recursion: when the input list has fewer than two distinct elements, just return it.
q) 2>count distinct x
Note: this is the correction. count was missing from the original code.
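The steps above can also be followed in a more verbose language. Below is a minimal Python sketch of the same algorithm, written only as an illustration: the function name quicksort and the variable names are my own, and the comments map each line back to the q expression it stands for.

```python
import random

def quicksort(xs):
    """Python rendering of the q one-liner; all names here are illustrative."""
    # Step 5 (end condition): 2>count distinct x
    if len(set(xs)) < 2:
        return list(xs)
    pivot = random.choice(xs)            # step 1: rand x
    mask = [v < pivot for v in xs]       # step 2.a: x<rand x
    # Steps 2.b and 2.c: x where each not scan mask
    parts = [
        [v for v, m in zip(xs, mask) if m],      # elements less than the pivot
        [v for v, m in zip(xs, mask) if not m],  # elements greater than or equal to it
    ]
    # Steps 3 and 4: q each ..., then raze to flatten the sorted parts
    return [v for part in parts for v in quicksort(part)]

print(quicksort([1, 0, 5, 4, 3]))  # prints [0, 1, 3, 4, 5]
```

If the pivot happens to be the smallest element, the first part is empty and the second part is the whole list, so the call simply retries with a new random pivot. The q version behaves the same way.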

Is this the simplified version of this boolean expression? Or is this reviewer wrong

I've tried doing the truth table, but one expression has 3 literals and the other has 4, so I got confused.
F = (A+B+C)(A+B+D')+B'C;
and this is the simplified version
F = A + B + C
http://www.belley.org/etc141/Boolean%20Sinplification%20Exercises/Boolean%20Simplification%20Exercise%20Questions.pdf
I think there's something wrong with this reviewer. Or is it accurate?
By the way, is simplification different from minimizing from sum of minterms to sum of products?
Yes, the two expressions are equivalent.
Draw the truth table for both expressions, assuming that there are four input variables in both. The value of D will not play into the second truth table: values in cells with D=1 will match values in cells with D=0. In other words, you can think of the second expression as
F = A + B + C + (0)(D)
You will see that both tables match: the (A+B+C)(A+B+D') subexpression has zeros at ABCD = {0000, 0001, 0011}; (A+B+C) has zeros only at {0000, 0001}. Adding B'C patches the zero at 0011 in the first subexpression, so the results are equivalent.
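If drawing the 16-row table by hand feels error-prone, the comparison can be brute-forced. This is a small Python sketch, not part of the original exercise: the function names original and simplified are my own, and D is passed to the simplified form only so that both tables have the same four inputs.

```python
from itertools import product

def original(a, b, c, d):
    # F = (A+B+C)(A+B+D') + B'C
    return ((a or b or c) and (a or b or (not d))) or ((not b) and c)

def simplified(a, b, c, d):
    # F = A + B + C; D is accepted but ignored
    return a or b or c

# All 16 combinations of A, B, C, D.
rows = list(product([False, True], repeat=4))
mismatches = [r for r in rows if bool(original(*r)) != bool(simplified(*r))]
print(mismatches)  # an empty list means the two expressions agree on every row
```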