What is (!). in kdb, and are the use cases below valid? - kdb

What is (!). called in kdb?
Are the use cases below valid uses of (!). to convert a list to a dictionary, or are there better ways? Are there other uses of (!). ?
Example:
q)(!). (`A`B;(`C`D`E;`F`G`H));
q).[(!);flip (`A`B;`C`D;`E`F)]
I cannot find any documentation on the use cases of (!). in the kdb tutorials. Could you please share any information on (!). and its uses?

It's a form of apply, and yes, your use case is valid. The operator is wrapped in parentheses because it is a dyadic infix operator, as is dot apply (.) itself.
If you try to apply it unwrapped, your expression has the following shape, which q rejects:
// infixOp infixOp operand
q)+ . 4 5
'
[0] + . 4 5
^
Wrapping the operator in parentheses makes q treat it as an operand (a noun), so the expression now becomes
// operand infixOp operand
q)(+). 4 5
9
If you define a function that can't be used infix (here f, even though it is bound to +), there's no need to wrap it:
q)f:+
q)4 f 5
'type
[0] 4 f 5
^
q)f . 4 5
9
If you use apply with bracket notation, as in your second example, there's no need to wrap the function either:
q).[+;4 5]
9
https://code.kx.com/q/ref/apply/#apply-index
https://code.kx.com/q/basics/syntax/#parentheses-around-a-function-with-infix-syntax
Jason

In terms of use cases, I find it very useful in scripts when defining dictionaries/tables as configs, particularly when a dictionary is too wide for the screen or when it's more useful to see the fields/mappings vertically as pairs.
For example:
mapping:(!) . flip(
(`one; 1);
(`two; 2);
(`three; 3));
is much easier to read when scanning through a q script than
mapping2:`one`two`three!1 2 3
when the latter gets very wide.
It makes no difference to the actual dictionary, of course, because, as Jason pointed out, it's the same thing.

Related

What is the general pattern behind (dyadic) function composition's syntax?

The Q Tips book (Nick Psaris) shows the
following function (Chapter 10):
q)merge:`time xdesc upsert
As it is stated, it corresponds to function composition. I see the pattern: the
expression supplies a function that takes both arguments for upsert and then
uses its result to feed time xdesc. However the syntax feels weird, since
I would expect upsert to be the second argument of the xdesc invocation.
Aiming at simplifying the expression, I could see that the very same scenario
applies here:
q)f:1+*
q)f[2;3]
7
If we show its result, we can clearly see that f behaves as expected:
q)f
+[1]*
However, if we slightly modify the function, the meaning of the expression is
completely different:
q)g:+[1;]*
q)g[2;3]
'rank
[0] g[2;3]
^
In fact, +[1;] is passed as first argument to the * operator instead,
leading us to a rank error:
q)g
*[+[1;]]
I could also notice that the pattern breaks when the first function is
"monadic":
q)h:neg *
q)h[2;3]
'rank
[0] h[2;3]
^
Also here:
q)i:neg neg
'type
[0] i:neg neg
^
At this point, my intuition is that this pattern only applies when we are
interested in composing standard (vs user-defined) dyadic operators that exploit infix
notation. Am I getting it right? Is this syntactic sugar actually more general? Is there any
documentation where the pattern is fully described? Thanks!
There are some documented ways to achieve what you wish:
https://code.kx.com/q/ref/apply/#composition
You can create a unary train using @
q)r:neg neg@
q)r 1
1
https://code.kx.com/q/ref/compose/
You can use ' to compose a unary value with another of rank >=1
q)f:('[1+;*])
q)f[2;3]
7
The behaviour you are seeing is likely not an officially supported feature of q, so it should not be relied upon. This link may be of interest:
https://github.com/quintanar401/DCoQ

kdb: differences between value and eval

The KX reference at https://code.kx.com/q/ref/value/ says that when x is a list, value[x] is the result of evaluating the list as a parse tree.
Q1. In the code below, I understand that (A) is a parse tree, given the definition quoted below. However, why does (B) also work? Is ("+";3;4) a valid parse tree?
q)value(+;3;4) / A
7
q)value("+";3;4) / B
7
q)eval(+;3;4) / C
7
q)eval("+";3;4) / D
'length
[0] eval("+";3;4)
Any other parse tree takes a form of a list, of which the first item
is a function and the remaining items are its arguments. Any of these
items can be parse trees. https://code.kx.com/q/basics/parsetrees/
Q2. In the code below, value fails to return the result of what I think is a valid parse tree, but eval works fine, recursively evaluating the tree. Does this mean the description at the top is wrong?
q)value(+;3;(+;4;5))
'type
[0] value(+;3;(+;4;5))
^
q)eval(+;3;(+;4;5))
12
Q3. In general then, how do we choose whether to use value or eval?
Put simply, the difference between eval and value is that eval is specifically designed to evaluate parse trees, whereas value evaluates parse trees among the other things it does. For example, value can be used to get the values of a dictionary, or to evaluate a string, such as:
q)value"3+4"
7
Passing the same string to eval instead, we simply get the string back:
q)eval"3+4"
"3+4"
1. Following on from this, the first part of your question is fairly easy to answer. ("+";3;4) is not technically the parsed form of 3+4, as we can see from:
q)parse"3+4"
+
3
4
The good thing about value in this case is that it evaluates the string "+" into the operator + and then evaluates the resulting parse tree. eval cannot interpret the string "+", as this is outside the scope of the function. This is why A, B and C work but D does not.
2. In part two, your parse tree is indeed correct, and once again we can see this with the parse function:
q)parse"3+(4+5)"
+
3
(+;4;5)
eval can always be used to get the result you want, provided your parse tree represents a valid statement. value will not work on all parse trees, only "simple" ones, so the nested list you have here cannot be evaluated by value.
3. In general, eval is probably the best choice for evaluating parse trees that you know to be correctly formed, as it evaluates them properly even when they are nested.
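The difference between the two can be sketched with a toy model in Python. This is not q and not how kdb implements either function: `value_like`, `eval_like` and the `OPS` lookup table are hypothetical names, and the sketch assumes that value applies the first item to the remaining items as they stand, which is consistent with the 'type error in Q2.

```python
import operator

# Toy stand-ins for q parse trees: a tuple whose first item is a function
# and whose remaining items are its arguments (possibly nested tuples).
OPS = {"+": operator.add, "*": operator.mul}  # hypothetical name lookup

def value_like(tree):
    """Resolve a string head by name, then apply it to the raw arguments."""
    head, *args = tree
    if isinstance(head, str):
        head = OPS[head]
    return head(*args)  # nested trees are passed through unevaluated

def eval_like(tree):
    """Evaluate nested trees recursively before applying the head."""
    if not isinstance(tree, tuple):
        return tree
    head, *args = tree
    return head(*[eval_like(arg) for arg in args])

nested = (operator.add, 3, (operator.add, 4, 5))

print(value_like((operator.add, 3, 4)))  # case A
print(value_like(("+", 3, 4)))           # case B: the string is resolved first
print(eval_like(nested))                 # nested tree, evaluated inside-out
try:
    value_like(nested)
except TypeError:
    print("value_like cannot add 3 to an unevaluated subtree")
```

In this model the nested case fails for the same kind of reason as in the question: the inner tree reaches the addition as a plain list rather than as the number 9.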

Expressions/operator precedence in Amend At and in function parameters

I always thought that in q and in k all expressions separated by ; are evaluated left-to-right, and that operator precedence inside each expression is right-to-left.
But then I tried to apply this principle to the parameters of the Amend At operator. Confusingly, it seems to work in the opposite direction:
$ q
KDB+ 3.6 2019.04.02 Copyright (C) 1993-2019 Kx Systems
q)@[10 20 30;g;:;100+g:1]
10 101 30
The same order applies to function parameters too:
q){x+y}[q;10+q:100]
210
So why does this happen? Why is the last parameter calculated first and only then the first one? Is it a feature we should avoid?
Update: evaluation vs parsing.
There are other cases to consider: https://code.kx.com/q/ref/apply/#when-e-is-not-a-function
q)@[2+;"42";{)}]
')
[0] #[2+;"42";{)}]
q)@[string;42;a:100] / expression not a function
"42"
q)a // but a was assigned anyway
100
q)@[string;42;{b::99}] / expression is a function
"42"
q)b // not evaluated
'b
[0] b
^
The semicolon is a multi-purpose separator in q. It can separate statements (e.g. a:10; b:20), in which case the statements are evaluated left-to-right, as in many other languages. But when it separates the elements of a list, it forms a list expression, which is evaluated right-to-left like any other q expression.
Like in this example:
q)(q;10+q:100)
110 100
One of the many overloads of the dot operator (.) applies its left operand to the list of values in its right operand:
q){x+y} . (q;10+q:100)
210
In order to do so, the list expression itself needs to be evaluated first, and it will be, from right to left like any other list expression.
However, the latter is just another way of getting the result of
{x+y}[q;10+q:100]
which therefore should produce the same value. And it does. By evaluating function arguments from right-to-left, of course!
A side note: please don't be confused by the conditional evaluation statement $[a;b;c]. Even though it looks like an expression, it is in fact a statement that evaluates a first and only then either b or c. In other words, a, b and c are not arguments of some function $ in this case.

Can I write a PCRE conditional that only needs the no-match part?

I am trying to create a regular expression to determine if a string contains a number for an SQL statement. If the value is numeric, then I want to add 1 to it. If the value is not numeric, I want to return a 1. More or less. Here is the SQL:
SELECT
field,
CASE
WHEN regexp_like(field, '^ *\d*\.?\d* *$') THEN dec(field) + 1
ELSE 1
END nextnumber
FROM mytable
This actually works, and returns something like this:
INVALID 1
00000 1
00001E 1
00379 380
00013 14
99904 99905
But to push the envelope of understanding, what if I wanted to cover negative numbers, or those with a positive sign? The sign would have to immediately precede or follow the number, but not both, and I would not want to allow white space between the sign and the number.
I came up with a conditional expression with a capture group to capture the sign on the front of the number to determine if a sign was allowed on the end, but it seems a little awkward to handle given I don't really need a yes-pattern.
Here is the modified regex: ^ *([+-])?\d*\.?\d*(?(1) *|[+-]? *)$
This works at regex101.com, but in order for it to work I need to have something before the pipe, so I have to duplicate the next pattern in both the yes-pattern and the no-pattern.
All that background for this question: How can I avoid that duplication?
EDIT: DB2 for i uses International Components for Unicode to provide regular expression processing. It turns out that this library does not support conditionals as PCRE does, so I changed the tags on this question. The answer given by Wiktor Stribiżew provides a working alternative to the conditional by using a negative lookahead.
You do not have to duplicate the end pattern, just move it outside the conditional:
^ *([+-])?\d*\.?\d*(?(1)|[+-]?) *$
See the regex demo. So, the yes-part is empty, and the no-part has an optional pattern.
You may also solve it with a mere negative lookahead:
^ *([+-](?!.*[-+]))?\d*\.?\d*[+-]? *$
See another regex demo. Here, ([+-](?!.*[-+]))? optionally matches a + or - that is not followed, after any zero or more characters, by another + or -.
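Both patterns can be checked side by side with a short script. This is a minimal sketch in Python, whose re module supports both conditionals and lookaheads; it stands in for the regex engine only and says nothing about DB2 or ICU behaviour. The sample inputs and the helper name `classify` are made up for illustration.

```python
import re

# The two patterns from the answer, verbatim.
CONDITIONAL = re.compile(r"^ *([+-])?\d*\.?\d*(?(1)|[+-]?) *$")
LOOKAHEAD = re.compile(r"^ *([+-](?!.*[-+]))?\d*\.?\d*[+-]? *$")

def classify(text):
    """Return (conditional_matches, lookahead_matches) for one input string."""
    return (bool(CONDITIONAL.search(text)), bool(LOOKAHEAD.search(text)))

# Hypothetical sample values: a sign on one side only should be accepted.
samples = ["00379", "-5", "5-", " 12.5+ ", "-5-", "- 5", "00001E", "INVALID"]
for s in samples:
    print(repr(s), classify(s))
```

On these inputs the two patterns agree: a sign on either side passes, while a sign on both sides, or white space between the sign and the digits, is rejected. Both also accept an empty string, because every numeric part is optional, so a separate check may be needed before converting the value.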

Is 35.+(10) still called infix notation?

I nearly talked about
35.+(10)
as an example of postfix notation because I understood
35 + 10
to be infix notation (at least everyone talks about that as an example of infix notation). But that's wrong isn't it?
35 10 +
would be postfix.
So how do I distinguish between the first two examples by name? Are they both "infix" but the second just a neater way?
Indeed it is still "infix".
Postfix means that all operands to an operator come in the stream before the operator itself. (an example is the factorial "!" operator in mathematics)
Prefix means the operator comes before the operands (an example is the "negate"/"-" operator to make a number negative).
Infix simply means that the operator is somewhere between the operands.
To decide how to name the application syntax, break the fragment up into tokens.
35.+(10)
is
[35] [.] [+] [(] [10] [)]
Dropping the redundant parens and naming '.' as 'apply', we get:
[35] [apply] [+] [10]
So it most certainly is infix, as the binary operator is between the first and second argument.
It's just a noisier way of writing 35 + 10.
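The same distinction can be sketched in Python rather than the language of the question. Here `(35).__add__(10)` plays the role of 35.+(10): an explicit method call on the left operand with the operator still sitting between the two operands. The `eval_postfix` helper is a hypothetical illustration of what true postfix input would need.

```python
import operator

# Infix operator, and the same call spelled as a method on the left operand.
infix = 35 + 10
method_call = (35).__add__(10)

def eval_postfix(tokens):
    """Evaluate postfix tokens such as ["35", "10", "+"] using a stack."""
    ops = {"+": operator.add, "-": operator.sub, "*": operator.mul}
    stack = []
    for tok in tokens:
        if tok in ops:
            right = stack.pop()  # operands were pushed before the operator
            left = stack.pop()
            stack.append(ops[tok](left, right))
        else:
            stack.append(int(tok))
    return stack.pop()

print(infix, method_call, eval_postfix("35 10 +".split()))
```

All three forms produce the same number; only the postfix one requires both operands to arrive before the operator.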