Binary application on scan in KDB? - kdb

I'm trying to understand this:
100+\ 1 2 3
101 103 106
Which works fine.
Question 1:
When I wrap this in brackets, I get an error I wasn't expecting:
(100+\) 1 2 3
'Cannot write to handle 100. OS reports: Bad file descriptor
What am I doing wrong here? It doesn't look like I'm writing a file to me.
Question 2:
Given that +[1;2] evaluates to 3, I believe this:
+[100;]\ 1 2 3
'
[0] +[100;]\ 1 2 3
(or perhaps +[;100]\ 1 2 3) should also work with projection, but it doesn't. What am I doing wrong here?

Question 1:
Use parse to determine order of execution
q)show pt:parse "(100+\\)1 2 3"; // need to escape \
((\;+);100)
1 2 3
q)eval each pt // should be clearer now
100
1 2 3
q)
q)value eval each pt // attempting to apply 100 to list which cannot be done
'Cannot write to handle 100. OS reports: Bad file descriptor
[0] value eval each pt
^
Question 2:
The projection is unary & is applied to the entire right argument. With unary application, evaluations will (attempt to) continue until convergence - https://code.kx.com/q/ref/accumulators/#unary-values
q)(neg\)1 2 3
1 2 3
-1 -2 -3
q)+[100]\[1 2 3]
'wsfull
m 0 68157440
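As a rough illustration of the Converge rule, here is a model written in Python rather than q. The helper name converge_scan and the max_steps cap are assumptions made for this sketch; q has no such cap, which is why the second case above runs out of memory instead of stopping.

```python
def converge_scan(f, x, max_steps=1000):
    """Model of unary f\\ x: collect results until one repeats.

    Stops when the next result equals the previous result or the
    initial value. max_steps is an artificial guard for this sketch.
    """
    out = [x]
    prev = x
    for _ in range(max_steps):
        nxt = f(prev)
        if nxt == prev or nxt == x:
            return out
        out.append(nxt)
        prev = nxt
    raise RuntimeError("no convergence within max_steps")


def negate(v):
    return [-e for e in v]


def add100(v):
    return [100 + e for e in v]


# Like (neg\)1 2 3: the third application returns the initial value, so it stops.
print(converge_scan(negate, [1, 2, 3]))  # [[1, 2, 3], [-1, -2, -3]]

# Like +[100]\[1 2 3]: the values grow forever, so only the guard stops it.
try:
    converge_scan(add100, [1, 2, 3], max_steps=50)
except RuntimeError as err:
    print("gave up:", err)
```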

Question 1
When an iterator (here \) is applied postfix (as is usual) to a function (here +) it derives a function (here +\) that is both variadic (as per @mturkington) and has infix syntax. You can apply it as a unary or as a binary. Your example 100+\1 2 3 applies it as a binary.
The parser needs a clue if you want to apply +\ as a unary. You can apply any function using bracket notation. Or you can parenthesise it: (+\) has noun syntax, as does the list (+;-;*;%). You can apply or index a noun with prefix syntax.
q)100+\1 2 3 / binary application, infix syntax
101 103 106
q)+\[100;1 2 3] / binary application, bracket syntax
101 103 106
q)+\[1 2 3] / unary application, bracket syntax
1 3 6
q)(+\)1 2 3 / unary application, prefix syntax
1 3 6
Question 2
You don’t say what result you expect from using the projection. I’ll assume you’re exploring a different way of getting the same result as in Q1.
The key issue here is that the projection of binary Add on 100 is a unary +[;100] (or +[100] or just 100+), and the accumulators \ and / applied to a unary are the Converge, Do and While iterators.
None of these gives you the Q1 result. For unary f, the derived function f\ just keeps applying f successively.
q)5 +[100]\ 1 2 3 / do 100+ five times
1 2 3
101 102 103
201 202 203
301 302 303
401 402 403
501 502 503
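The three behaviours above can be mimicked in Python, purely as a model of the semantics and not as q code. The helper names scan_binary, scan_unary and do_scan are invented for this sketch.

```python
from itertools import accumulate
from operator import add


def scan_binary(f, init, xs):
    """Model of x f\\ y for binary f: running fold seeded with init.

    The seed itself is not part of the result.
    """
    return list(accumulate(xs, f, initial=init))[1:]


def scan_unary_application(f, xs):
    """Model of (f\\) y for binary f: running fold with no seed."""
    return list(accumulate(xs, f))


def do_scan(f, n, x):
    """Model of n f\\ x for unary f: x followed by n successive applications."""
    out = [x]
    for _ in range(n):
        x = f(x)
        out.append(x)
    return out


print(scan_binary(add, 100, [1, 2, 3]))        # [101, 103, 106]
print(scan_unary_application(add, [1, 2, 3]))  # [1, 3, 6]
for row in do_scan(lambda v: [100 + e for e in v], 5, [1, 2, 3]):
    print(row)                                 # six rows, ending [501, 502, 503]
```

The point of the comparison is that the same \ does different jobs depending on the rank of the function it is applied to: a fold for a binary function, repeated application for a unary one.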

In this case, +\ is the underlying code for the sums keyword. This is one of several keywords that are known as variadic because their rank is not fixed. When you try (100+\) 1 2 3, kdb applies +\ as a unary (the equivalent of sums) to 100, which returns 100, and then tries to apply 100 to the list 1 2 3. Applying an integer to an argument means writing to that handle, and handle 100 of course doesn't exist. That's why you get the error you get.
As for the syntax in Question 2, the following should work (adapted from this page on the variadic syntax)
q)+\[100;1 2 3]
101 103 106

SPSS Merging Data with duplicate Keys

I am currently attempting to join 2 datasets using SPSS syntax but am struggling as I have duplicate values on the keys. I would like for the joined data to be duplicated for each instance of the key on the source dataset (or other way round as it doesn't matter which is the source).
The datasets are like the following -
Data1 (3rd column placeholder)
batch  run  date
A      1    1
A      2    1
A      3    1
B      1    1
C      1    1
C      2    1
D      1    1
E      1    1

Data2
batch  Value1  Value2
A      1       21
A      2       22
A      3       23
A      4       24
B      5       25
B      6       26
B      7       27
B      8       28
C      9       29
C      10      30
C      11      31
C      12      32
D      13      33
D      14      34
D      15      35
D      16      36
E      17      37
E      18      38
E      19      39
E      20      40
Current attempt
What I have just now is a method where I run CASESTOVARS on Data1 before matching it onto Data2 and then VARSTOCASES to expand it out. This works perfectly with my test data but, unfortunately, it requires that I know exactly how many 'runs' there will be. That will not be known in production. It could be 1 or more.
Is there a method to join these datasets while expanding the joined data into the multiple cases in the source?
I am open to using macros but am not able to utilise Python solutions for this (which would probably be easier!).
edit - Unfortunately, extensions are also not possible for me to use.
CASESTOVARS
/ID = batch .
DATASET ACTIVATE data2 .
MATCH FILES
/FILE = *
/TABLE = data1
/BY batch .
EXECUTE .
VARSTOCASES
/MAKE run FROM BATCH_RUN_ID.1 TO BATCH_RUN_ID.3 .
EXECUTE .
If Python and the dependent extension commands are not available, here's an idea for handling the dynamic list length in the varstocases phase.
What you'll do is basically to create a new dataset with the maximum number of runs possible, attach your read dataset to it, and then set the varstocases to go for that maximum number of runs (blank rows are dropped automatically):
dataset name orig.
data list free/throwthisrow (f1) BATCH_RUN_ID.1 to BATCH_RUN_ID.50 (50F8.2) .
begin data
1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
end data.
add files /file=* /file=orig .
EXECUTE.
select if missing(throwthisrow).
VARSTOCASES
/MAKE run FROM BATCH_RUN_ID.1 TO BATCH_RUN_ID.50 /drop throwthisrow.
EXECUTE .
To complete your present approach you can use the spssinc select variables extension command. You will use it to automatically create a list of the variables you want to name in your varstocases command, so that the syntax will automatically adapt itself to the number of runs in the data.
So after casestovars and match files:
spssinc select variables macroname="!from" /properties pattern = "BATCH_RUN_ID".
VARSTOCASES /MAKE run FROM !from .

What is (!). in kdb and are below usecases valid to use it?

What is (!). called in kdb?
Are the use cases below valid uses of (!). to convert a list to a dictionary, or are there better ways? Are there other uses of (!). ?
Example:
q)(!). (`A`B;(`C`D`E;`F`G`H));
q).[(!);flip (`A`B;`C`D;`E`F)]
I cannot find any documentation on the use cases on (!). in kdb tutorials. Please share any information on (!). and its uses?
It's a form of Apply, and yes, your use case is valid. The operator is wrapped in parentheses because it is itself a dyadic infix operator, as is dot Apply (.)
If you attempt to apply it as is, your expression is like so, which Q doesn't like
// infixOp infixOp operand
q)+ . 4 5
'
[0] + . 4 5
^
Wrapping the operator within parentheses effectively transforms it so the expression now becomes
// operand infixOp operand
q)(+). 4 5
9
If you define a function which can't be used infix, then there's no need to wrap it
q)f:+
q)4 f 5
'type
[0] 4 f 5
^
q)f . 4 5
9
If using apply with bracket notation as in your example, there's no need to wrap the function
q).[+;4 5]
9
https://code.kx.com/q/ref/apply/#apply-index
https://code.kx.com/q/basics/syntax/#parentheses-around-a-function-with-infix-syntax
Jason
In terms of use-cases, I find it very useful when defining dictionaries/tables as configs particularly when dictionaries are too wide (horizontal) for the screen or when it's more useful to see fields/mappings vertically as pairs. From a code/script point of view that is.
For example:
mapping:(!) . flip(
(`one; 1);
(`two; 2);
(`three; 3));
is much easier to read when scanning through a q script than
mapping2:`one`two`three!1 2 3
when the latter gets very wide.
It makes no difference to the actual dictionary of course because as Jason pointed out it's the same thing.
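For readers more familiar with other languages, the same idea can be modelled in Python (an analogy only, not q). The helper names apply, flip and make_dict are made up for this sketch and stand in for q's dot Apply, flip and ! respectively.

```python
from operator import add


def apply(f, args):
    """Model of f . args: call f with the items of args as its arguments."""
    return f(*args)


def flip(rows):
    """Model of flip: transpose a list of equal-length rows."""
    return [list(col) for col in zip(*rows)]


def make_dict(keys, values):
    """Model of keys!values."""
    return dict(zip(keys, values))


# Like (+). 4 5
print(apply(add, [4, 5]))  # 9

# Like mapping:(!) . flip(...): pairs are written vertically, then transposed
# into a keys list and a values list that are passed as the two arguments.
pairs = [
    ("one", 1),
    ("two", 2),
    ("three", 3),
]
mapping = apply(make_dict, flip(pairs))
print(mapping)  # {'one': 1, 'two': 2, 'three': 3}
```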

Applying of projections and monadic functions in k

How do I properly apply monadic functions and projections in k?
KDB+ 3.6 2018.05.17 Copyright (C) 1993-2018 Kx Systems
q) \
(5*;10*)@\:2
10 20
({x};{x*x})@\:2
2 4
(#;#)@\:2
(#[2];#[2])
Why do the first two examples work properly while the last one doesn't? I thought it would be:
(#;#)@\:2
1 1
but it gives me a strange result.
# (take) is a dyadic function, unlike count, which is monadic. This is why you were getting a projection when applying only a single argument to it.
q)count
#:
q)type (count)
101h
q)type (#)
102h
You can use the . (dot-apply) operator on dyadic functions with two operands to return a result that is not a projection.
(#;#) .\: (3;til 10)
0 1 2
0 1 2
Got it!
q)\
(#;#)@\:2
(#[2];#[2])
(#:;#:)@\:2
1 1
For completeness, this relates to unary forms, which are documented here: https://code.kx.com/q/basics/exposed-infrastructure/#unary-forms
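The difference between getting a projection and getting a value can be mimicked in Python with functools.partial. This is only an analogy: the helpers take, count and apply_one are invented for this sketch, and take is simplified compared with q's # (it does not cycle when n exceeds the length).

```python
import inspect
from functools import partial


def take(n, xs):
    """Simplified two-argument take: the first n items of xs."""
    return xs[:n]


def count(x):
    """One-argument count: atoms count as 1."""
    return len(x) if isinstance(x, (list, tuple)) else 1


def apply_one(f, x):
    """Apply f to a single argument, q-style.

    A one-argument function returns a value; a function that needs more
    arguments returns a projection (modelled here as a partial).
    """
    arity = len(inspect.signature(f).parameters)
    return f(x) if arity == 1 else partial(f, x)


# Applying each function in a list to the same single argument.
projections = [apply_one(f, 2) for f in (take, take)]
counts = [apply_one(f, 2) for f in (count, count)]

print(counts)                             # [1, 1]
print(projections[0](list(range(10))))    # [0, 1], the projection still awaits its list
```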

Range operator [3..max?] for selecting elements from an array [duplicate]

How can I get the array element range of first to second last?
For example,
$array = 1,2,3,4,5
$array[0] - will give me the first (1)
$array[-2] - will give me the second last (4)
$array[0..2] - will give me first to third (1,2,3)
$array[0..-2] - I'm expecting to get first to second last (1,2,3,4) but I get 1,5,4 ???
I know I can do long hand and go for($x=0;$x -lt $array.count;$x++), but I was looking for the square bracket shortcut.
You just need to calculate the end index, like so:
$array[0..($array.length - 2)]
Do remember to check that you actually have at least two entries in your array first, otherwise you'll find yourself getting duplicates in the result.
An example of such a duplicate would be:
@(1)[0..-1]
Which, from an array of a single 1 gives the following output
1
1
There might be a situation where you are processing a list, but you don't know the length. Select-object has a -skiplast parameter.
(1,2,3,4,5 | select -skiplast 2)
1
2
3
As mentioned earlier, the best solution here is:
$array[0..($array.length - 2)]
The problem with $array[0..-2] comes from how the range operator ".." works in PowerShell. If you evaluate just "0..-2" on its own, you will see that the result is an array of numbers counting down from 0 to -2.
>> 0..-2
0
-1
-2
And when you're trying to do $array[0..-2] in PowerShell it's the same as if you would do $array[0,-1,-2]. That's why you get results as 1, 5, 4 instead of 1, 2, 3, 4.
It could be kind of counterintuitive at first especially if you have some Python or Ruby background, but you need to take it into account when using PowerShell.
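To make the expansion concrete, here is a small model of it in Python (not PowerShell). The helper names ps_range and ps_index are made up for this sketch; they imitate how ".." produces an explicit list of indices that is then looked up one by one.

```python
def ps_range(a, b):
    """Model of PowerShell's a..b: every integer from a to b inclusive,
    counting down when b is smaller than a."""
    step = 1 if b >= a else -1
    return list(range(a, b + step, step))


def ps_index(array, indices):
    """Model of $array[i, j, ...]: look up each index in turn.
    Negative indices count from the end, as in PowerShell."""
    return [array[i] for i in indices]


array = [1, 2, 3, 4, 5]

print(ps_range(0, -2))                               # [0, -1, -2]
print(ps_index(array, ps_range(0, -2)))              # [1, 5, 4]
print(ps_index(array, ps_range(0, len(array) - 2)))  # [1, 2, 3, 4]
```

Note that this deliberately avoids Python's own slice syntax (array[0:-1]), which is the behaviour people coming from Python tend to expect from $array[0..-2].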
Robert Westerlund's answer is excellent.
I just saw this alternative on the "Everything you wanted to know about arrays" page and wanted to try it out.
I like it because it seems to describe exactly what the goal is: end one short of the upper bound.
$array[0..($array.GetUpperBound(0) - 1)]
1
2
3
4
I used this variation of your original attempt to uninstall all but the latest version from Get-InstalledModule. It's really short but not perfect: if there are more than 9 items it still returns only 8, although you can use a larger negative number.
$array[-9..-2]
1
2
3
4

Unicode characters in Julia: star symbol

I am trying to simplify the notation in a couple of functions using some unicode characters. In one of these functions I tried to use the star symbol (\star), but I got several errors and warnings.
Please have a look at the following working example:
a = [1 2 3; 4 5 6; 7 8 9]
- Gives: a 3×3 Array{Int64,2}
a⋆ = [1 2 3; 4 5 6; 7 8 9]
- Gives: ERROR: syntax: unexpected "="
Why does the star symbol not work when it is used as above? Does it have a designated function in Julia?
The ⋆ symbol parses as an infix operator:
julia> dump(parse("a⋆b"))
Expr
head: Symbol call
args: Array{Any}((3,))
1: Symbol ⋆
2: Symbol a
3: Symbol b
typ: Any
A case could be made for allowing ⋆ as a character in identifier names, but that would be a breaking change, and so far we have generally parsed characters that the Unicode standard considers operator-like as operators with the appropriate precedence.