Expressions/operator precedence in Amend At and in function parameters - kdb

I always thought that in q and in k all expressions divided ; evaluated left-to-right and operator precedence inside is right-to-left.
But then I tried to apply this principle to Ament At operator parameters. Confusingly it seems working in the opposite direction:
$ q KDB+ 3.6 2019.04.02 Copyright (C) 1993-2019 Kx Systems
q)#[10 20 30;g;:;100+g:1]
10 101 30
The same precedence works inside a function parameters too:
q){x+y}[q;10+q:100]
210
So why does it happend - why does it first calculate the last one parameter and only then first? Is it a feature we should avoid?
Upd: evaluation vs parsing.
There could be another cases: https://code.kx.com/q/ref/apply/#when-e-is-not-a-function
q)#[2+;"42";{)}]
')
[0] #[2+;"42";{)}]
q)#[string;42;a:100] / expression not a function
"42"
q)a // but a was assigned anyway
100
q)#[string;42;{b::99}] / expression is a function
"42"
q)b // not evaluated
'b
[0] b
^

The semicolon is a multi-purpose separator in q. It can separate statements (e.g. a:10; b:20), in which case statements are evaluated from left-to-right similar to many other languages. But when it separates elements of a list it creates a list expression which (an expression) is evaluated from right-to-left as any other q expression would be.
Like in this example:
q)(q;10+q:100)
110 100
One of many overloads of the dot operator (.) evaluates its left operand on a list of values in its right operand:
q){x+y} . (q;10+q:100)
210
In order to do so a list expression itself needs to be evaluated first, and it will be, from right-to-left as any other list expression.
However, the latter is just another way of getting the result of
{x+y}[q;10+q:100]
which therefore should produce the same value. And it does. By evaluating function arguments from right-to-left, of course!
A side note. Please don't be confused by the conditional evaluation statement $[a;b;c]. Even though it looks like an expression it is in fact a statement which evaluates a first and only then either b or c. In other words a, b and c are not arguments of some function $ in this case.

Related

What is the general pattern behind (dyadic) function composition's syntax?

The Q Tips book (Nick Psaris) shows the
following function (Chapter 10):
q)merge:`time xdesc upsert
As it is stated, it corresponds to function composition. I see the pattern: the
expression supplies a function that takes both arguments for upsert and then
uses its result to feed time xdesc. However the syntax feels weird, since
I would expect upsert to be the second argument of the xdesc invocation.
Aiming at simplifying the expression, I could see that the very same scenario
applies here:
q)f:1+*
q)f[2;3]
7
If we show its result, we can clearly see that f behaves as expected:
q)f
+[1]*
However, If we slightly modify the function, the meaning of the expression is
completely different:
q)g:+[1;]*
q)g[2;3]
'rank
[0] g[2;3]
^
In fact, +[1;] is passed as first argument to the * operator instead,
leading us to a rank error:
q)g
*[+[1;]]
I could also notice that the pattern breaks when the first function is
"monadic":
q)h:neg *
q)h[2;3]
'rank
[0] h[2;3]
^
Also here:
q)i:neg neg
'type
[0] i:neg neg
^
At this point, my intuition is that this pattern only applies when we are
interested on composing dyadic standard (vs user-defined) operators that exploit infix
notation. Am I getting it right? Is this syntactic sugar actually more general? Is there any
documentation where the pattern is fully described? Thanks!
There are some documented ways to achieve what you wish:
https://code.kx.com/q/ref/apply/#composition
You can create a unary train using #
q)r:neg neg#
q)r 1
1
https://code.kx.com/q/ref/compose/
You can use ' to compose a unary value with another of rank >=1
q)f:('[1+;*])
q)f[2;3]
7
Likely the behaviour you are seeing is not officially there to be exploited by users in q so should not be relied upon. This link may be of interest:
https://github.com/quintanar401/DCoQ

kdb: differences between value and eval

From KX: https://code.kx.com/q/ref/value/ says, when x is a list, value[x] will be result of evaluating list as a parse tree.
Q1. In code below, I understand (A) is a parse tree, given below definition. However, why does (B) also work? Is ("+";3;4) a valid parse tree?
q)value(+;3;4) / A
7
q)value("+";3;4) / B
7
q)eval(+;3;4) / C
7
q)eval("+";3;4) / D
'length
[0] eval("+";3;4)
Any other parse tree takes a form of a list, of which the first item
is a function and the remaining items are its arguments. Any of these
items can be parse trees. https://code.kx.com/q/basics/parsetrees/
Q2. In below code, value failed to return the result of what I think is a valid parse tree, but eval works fine, recursively evaluating the tree. Does this mean the topmost description is wrong?
q)value(+;3;(+;4;5))
'type
[0] value(+;3;(+;4;5))
^
q)eval(+;3;(+;4;5))
12
Q3. In general then, how do we choose whether to use value or eval?
put simply the difference between eval and value is that eval is specifically designed to evaluate parse trees, whereas value works on parse trees among other operations it does. For example value can be used to see the non-keyed values of dictionaries, or value strings, such as:
q)value"3+4"
7
Putting this string instead into the eval, we simply get the string back:
q)eval"3+4"
"3+4"
1 Following this, the first part of your question isn't too bad to answer. The format ("+";3;4) is not technically the parsed form of 3+4, we can see this through:
q)parse"3+4"
+
3
4
The good thing about value in this case is that it is valuing the string "+" into a the operator + and then valuing executing the parse tree. eval cannot understand the string "+" as this it outside the scope of the function. Which is why A, B and C work but not D.
2 In part two, your parse tree is indeed correct and once again we can see this with the parse function:
q)parse"3+(4+5)"
+
3
(+;4;5)
eval can always be used if your parse tree represents a valid statement to get the result you want. value will not work on all parse tree's only "simple" ones. So the nested list statement you have here cannot be evaluated by value.
3 In general eval is probably the best function of choice for evaluating your parse trees if you know them to be the correct parse tree format, as it can properly evaluate your statements, even if they are nested.

kdb/q question: How do I interpret this groupby in my functional selection?

I am new to kdb/q and am trying to figure out what this particular query means. The code is using functional select, which I am not overly comfortable with.
?[output;();b;a];
where output is some table which has columns size time symbol
the groupby filter dictionary b is defined as follows
key | value
---------------
ts | ("+";00:05:00v;("k){x*y div x:$[16h=abs[#x];"j"$x;x]}";00:05:00v;("%:";`time)))
sym | ("k){x'y}";"{`$(,/)("/" vs string x)}";`symbol)
For the sake of completeness, dictionary a is defined as
volume ("sum";`size)
In effect, the functional select seems to be bucketing the data into 5 minute buckets and doing some parsing in symbol. What baffles me is how to read the groupby dictionary. Especially the k)" part and the entire thing being in quotes. Can someone help me go through this or point me to resources that can help me understand? Any input will be appreciated.
The aggregation part of the function form takes a dictionary, the key being the output key column names and the values being parse tree functions.
A parse tree is an expression that is not immediately evaluated. The first argument as a function and subsequent elements are its arguments. The inner-most brackets are evaluated first and then it moves up the heirarchy, evaluating each one in turn. More detailed information can be found here and in the whitepaper linked on that page
You can use the function parse with a string argument to get the parse tree of a function. For example, the parse tree for 1+2+3 is (+;1;(+;2;3)):
q)parse "1+2+3"
+
1
(+;2;3)
The inner-most bracket (+;2;3) is evaluated first resulting in 5, before the result is propogated up to the outmost parse tree function (+;1;5) giving 6
The groupby part of the clause will evaluate one or more parse tree functions and then will collect together records with the same output from the grouping function.
Making the function a bit clearer to read:
(+;00:05:00v;({x*y div x:$[16h=abs[#x];"j"$x;x]}";00:05:00v;(%:;`time)))
Looking at the inner most bracket (%:;`time), it returns the result of %: applied on the time column. We can see that %: is k for the function ltime
q)ltime
%:
Moving up a level, the next function evaluated is the lambda function {x*y div x:$[16h=abs[#x];"j"$x;x]} with arguments 00:05:00v and the result of our previous evaluated function. The lambda rounds it down the the nearest 5 minute interval
({x*y div x:$[16h=abs[#x];"j"$x;x]};00:05:00v;(%:;`time))
Moving up once more to the whole expression it is equivalent to 00:05:00v + {x*y div x:$[16h=abs[#x];"j"$x;x]};00:05:00v;(%:;`time)), with 00:05:00 being added onto each result from the previous evaluation.
So essentially it first returns the local time of the timestamp, then
For the symbol aggregation
("k""{x'y}";{`$(,/)("/" vs string x)};`symbol)
The inner function {`$(,/)("/" vs string x)} strings a symbol, splits it at "/" character and then joins it back together, effectively removing the slash
The "k" is a function that evaluates the string using the k interpreter.
"k""{x'y}"" returns a function which itself takes a function x and argument y and modifies the function to use the each-both adverb '. This makes it so that the function x is applied on each symbol individually as opposed to the column as a whole.
This could be implemented in q instead of k like so:
({x#'y};{`$(,/)("/" vs string x)};`symbol)
The function {x#'y} takes the function argument {`$(,/)("/" vs string x)} and the symbol column as before, but we have to use # with the each-both adverb in q to apply the function on the arguments.
The aggregation function will then be applied to each group. In your case the function is a simple parse tree, which will return the sum of the size columns in each group, with the output column called volume
a:enlist[`volume]!enlist (sum;`size)

kdb - how does the 'where' function work

I want to understand what is happening with the below statement:
sum(til 100) where 000011110000b
That line evaluates to 22, and I can't figure out why. sum(til 100) is 4950. where 000011110000b returns the list 4 5 6 7. The kdb reference page doesn't seem to explain this use case. Why does the above line evaluate to 22?
Also, why does the below line result in error
4950 where 000011110000b
Square brackets are used for function call arguments, rather than parentheses. So the above is being interpreted as:
sum[(til 100) where 000011110000b]
And you can probably see now why that would evaluate to 22, i.e. it is the sum of the 5th,6th,7th,8th values of the list til 100, i.e. (0...99), because where is indexing into til 100 using the boolean list 000011110000b
q)til[100] where 000011110000b
4 5 6 7
As you will have noticed, you can omit square brackets when calling functions; but when doing so you need to be sure it will be parsed/interpreted as intended. In general, code is evaluated right-to-left so in this case (til 100) where 000011110000b was evaluated first, and the result passed to the sum function.
Regarding 4950 where 000011110000b - again if we think right to left, the interpreter is trying to pass the result of where 000011110000b to the function 4590. Although 4590 isn't a named function in kdb - this is the format used for IPC, i.e. executing commands on other kdb processes over TCP. So you are telling kdb to use the OS file descriptor number 4590 (which almost certainly won't correspond to a valid TCP connection) to run the command where 000011110000b. See the "message formats" section here for more details:
http://code.kx.com/q/tutorials/startingq/ipc/

Is it possible to influence how '_' lambda arguments are ordered?

I have a statement which currently looks like this:
arrays.foldLeft(0)((offset, array) => array.copyTo(largerArray, offset))
It would be great to express it as follows, but this is not possible since foldLeft is designed to take the seed argument first:
arrays.foldLeft(0)(_.copyTo(largerArray, _))
This is purely superficial - I'm just curious!
p.s. copyTo returns the next offset in this example.
The SLS seems to say "no".
Section 6.23, Anonymous Functions/Placeholder Syntax for Anonymous Functions:
An expression (of syntactic category Expr) may contain embedded
underscore symbols _ at places where identifiers are legal. Such an
expression represents an anonymous function where subsequent
occurrences of underscores denote successive parameters.
and
If an expression e binds underscore sections u1 , . . . , un, in this order, it is equivalent to the anonymous function (u'1 , ... u'n ) => e' where each u'i results from ui by replacing the
underscore with a fresh identifier and e' results from e by
replacing each underscore section ui by u'i.
Emphasis is mine - it explicitly states in both relevant section that a preserved ordering is assumed.
Personally, I think it makes sense to enforce that, if "only" for readability reasons.