Understanding moving window calcs in kdb

I'm struggling to understand this q code programming idiom from the kx cookbook:
q)swin:{[f;w;s] f each { 1_x,y }\[w#0;s]}
q)swin[avg; 3; til 10]
0 0.33333333 1 2 3 4 5 6 7 8
The notation is confusing. Is there an easy way to break it down as a beginner?
I get that the compact notation for the function is probably equivalent to this
swin:{[f;w;s] f each {[x; y] 1_x, y }\[w#0;s]}
w#0 means repeat 0 w times (presumably the zeros are filler for the first couple of observations?), and 1_x,y means join y onto x and then drop the first item. But I don't understand how this then plays out with f = avg applied with each. Is there a way to understand this easily?

http://code.kx.com/q/ref/adverbs/#converge-iterate
Scan (\) on a binary (two-param) function takes the first argument as the seed value - in this case 3#0 - and iterates through each of the items in the second list - in this case til 10 - applying the function (append new value, drop first).
q){1_x,y}\[3#0;til 10]
0 0 0
0 0 1
0 1 2
1 2 3
2 3 4
3 4 5
4 5 6
5 6 7
6 7 8
7 8 9
So now you have ten lists, and you can apply a function to each one: in this case avg, but it could be any other function that takes a list.
q)med each {1_x,y}\[3#0;til 10]
0 0 1 2 3 4 5 6 7 8f
q)
q)first each {1_x,y}\[3#0;til 10]
0 0 0 1 2 3 4 5 6 7
q)
q)last each {1_x,y}\[3#0;til 10]
0 1 2 3 4 5 6 7 8 9
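For readers more at home outside q, the same two steps (scan to build the windows, then apply a function to each window) can be sketched in Python. This is only an illustration under stated assumptions: the name swin is borrowed from the cookbook idiom, and itertools.accumulate stands in for scan. Unlike q's scan, accumulate with initial= also emits the seed, so the sketch drops it.

```python
from itertools import accumulate
from statistics import mean

def swin(f, w, s):
    """Apply f to each length-w sliding window of s, zero-padded at the start."""
    seed = [0] * w                        # q: w#0

    def step(x, y):
        return (x + [y])[1:]              # q: {1_x,y} (append y, then drop the first item)

    # accumulate yields the seed first; q's scan does not, so skip it.
    windows = list(accumulate(s, step, initial=seed))[1:]
    return [f(win) for win in windows]    # q: f each ...

print(swin(mean, 3, range(10)))
```

Passing a different aggregator (statistics.median, max, and so on) in place of mean corresponds to swapping avg for med, first or last in the q examples.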

Related

How do I do a scan over a table in KDB?

t: ([] a: til 10; b: til 10)
a b
---
0 0
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
I'm trying to get it to sum a, and upsert it, leaving b in place. I don't want to use q-sql for it.
I think it should be something like:
({x[`a]: x`a + y`a}\) t
I keep getting 'type errors though, on the indexing operation.
What am I doing wrong?
EDIT:
An even simpler example,
({x[`a]: 3}\) t
Same error. Expected result:
q)
a b
---
3 0
3 1
3 2
3 3
3 4
3 5
3 6
3 7
3 8
3 9
What I'm trying to achieve, in pseudocode:
assign case:
for each row in t
row[`a]: 3
For the summation case:
for each row in t
row[`a]: row_prior[`a] + row[`a]
I finally figured out the answer: the function has to return the modified row after the assignment:
({y[`a]: y[`a]+x[`a]; y}\) t
a b
----
0 0
1 1
3 2
6 3
10 4
15 5
21 6
28 7
36 8
45 9
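The key point, that each step must hand back the whole updated row so the next step can use it, can be sketched in Python. This is an analogy rather than q: the table is assumed to be a list of dicts, and step is a hypothetical helper name.

```python
from itertools import accumulate

def step(prev, row):
    # Copy the current row, overwrite column "a" with the running total,
    # and return the whole row. The q lambda ends with `; y` for the same reason.
    new_row = dict(row)
    new_row["a"] = prev["a"] + row["a"]
    return new_row

t = [{"a": i, "b": i} for i in range(10)]

# Without an initial value, accumulate uses the first row as the seed.
result = list(accumulate(t, step))
print([r["a"] for r in result])
```

If step returned only the new value of "a" instead of the row, the next iteration would try to index a number as if it were a row, which is the Python analogue of the 'type error above.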

How do I convert a list into a matrix in KDB?

I have a list of the form:
1 2 3 4
I'd like to convert it into a square matrix:
1 2
3 4
Which I think would be:
(1 2;3 4)
What's the canonical way to do this for n-sized matrices in KDB?
You can use take:
q)n: 2
q)(n; n) # 1 2 3 4
1 2
3 4
or for an m x n matrix:
q)m: 2
q)n: 3
q)(m; n) # 1 2 3 4 5 6
1 2 3
4 5 6
You can do:
n cut list
e.g.
q)3 cut til prd 3 3
0 1 2
3 4 5
6 7 8
Edit:
To fit any list into the smallest n*n matrix that holds it, filling the remaining positions with nulls, you can do:
q)f:{a:(ceiling sqrt b:count x); a cut x,((a*a) - b)#0N}
q)/e.g.
q)f til 10
0 1 2 3
4 5 6 7
8 9
q)
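The cut-and-pad logic can be sketched in Python for comparison. The names cut and to_square are hypothetical, and None is assumed as the stand-in for q's null 0N.

```python
import math

def cut(n, xs):
    """Split xs into consecutive rows of length n (the last row may be shorter)."""
    xs = list(xs)
    return [xs[i:i + n] for i in range(0, len(xs), n)]

def to_square(xs, fill=None):
    """Pad xs with `fill` up to the next perfect square, then cut it into rows."""
    xs = list(xs)
    if not xs:
        return []
    side = math.isqrt(len(xs))
    if side * side < len(xs):       # round the side length up, like ceiling sqrt
        side += 1
    padded = xs + [fill] * (side * side - len(xs))
    return cut(side, padded)

print(cut(3, range(9)))
print(to_square(range(10)))
```

The padded version always returns side rows of equal length, so the trailing rows consist partly or wholly of the fill value.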

What algorithm is used when rows in tables are sorted?

Let's assume that we have a table with two columns. The table contains data and our goal is to sort that table.
Assume our data looks like this, where y1 and y2 are the data in the columns.
You can produce that plot with MATLAB or GNU Octave.
% Simulate the model
[t,y] = ode45(@odefunc,[0 20],[1; -2]);
% Plot the simulation
close all
plot(t,y(:,1),'-r',t,y(:,2),'-b')
title('Solution of van der Pol Equation (\mu = 1) with ODE45');
xlabel('Time t');
ylabel('Solution y');
legend('y_1','y_2')
grid on
function dydt = odefunc(t,y)
dydt = [y(2); (1-0.1*y(1)^2)*y(2)-y(1) + 1];
end
If we look at the plot from above, we see the data like this:
You can create that plot with this code:
% Plot 3D bar
figure
imagesc(y)
colorbar
Here we can see that the plot looks very much like a table. My question is: what algorithm is used when sorting the rows of a table so that rows that look almost the same each get their own unique position in the table?
For example, if we have a table like this.
0 2 4
1 3 5
2 4 6
3 5 7
4 6 8
5 7 9
0 2 4
1 3 5
2 4 6
3 5 7
4 6 8
5 7 9
0 2 4
1 3 5
2 4 6
3 5 7
4 6 8
5 7 9
0 2 4
1 3 5
Here is the code if you want to create that table:
j = 0;
rows = 20;
for i = 1:rows
disp(sprintf("%i %i %i", j, j+2, j+4))
j = j + 1;
if(j + 4 >= 10)
j = 0;
end
end
We can see that there are four rows of 0 2 4 and three rows of 5 7 9.
I want all rows 0 2 4 close to each other and all rows 5 7 9 close to each other. And 0 2 4 cannot come after 5 7 9, because then the plot would look terrible.
For example, assume that we begin with row 1, the first row 0 2 4. Then we look for identical rows 0 2 4, and let's say we find four of them. Then we sort them.
0 2 4
0 2 4
0 2 4
0 2 4
Now the next row would be 1 3 5, and we find two rows of 1 3 5. We sort them.
0 2 4
0 2 4
0 2 4
0 2 4
1 3 5
1 3 5
After we have sorted for a while, we are going to have a table like this.
0 2 4
0 2 4
0 2 4
0 2 4
1 3 5
1 3 5
2 4 6
2 4 6
2 4 6
2 4 6
3 5 7
3 5 7
3 5 7
.
.
.
.
5 7 9
5 7 9
5 7 9
And now suppose we find 1 2 4, which is very similar to 0 2 4. We need to place 1 2 4 close to 0 2 4: perhaps between 0 2 4 and 1 3 5, or just before or after the 0 2 4 rows. How do I even know that 1 2 4 should be placed close to 0 2 4? That's the issue!
How can I sort that?
I need to do this in C because speed matters most here, but I think I will start in GNU Octave. I'm pretty sure the algorithm I'm looking for is one used for sorting in SQL.
Note that in practice the values are 10-bit integers, i.e. between 0 and 1023.
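One possible reading of the requirement, offered only as a sketch, is a plain lexicographic sort: compare rows column by column from left to right. That groups identical rows together and happens to place 1 2 4 directly after the 0 2 4 rows and before 1 3 5. Whether lexicographic order matches the intended notion of "similar" is an assumption; a row such as 0 9 9 would also sort next to 0 2 4. The function name sort_rows is hypothetical.

```python
def sort_rows(rows):
    """Sort rows lexicographically: by first column, then second, then third."""
    return sorted(tuple(r) for r in rows)

# Rebuild the 20-row example table from the question, then add the odd row 1 2 4.
rows = [(j % 6, j % 6 + 2, j % 6 + 4) for j in range(20)]
rows.append((1, 2, 4))

for row in sort_rows(rows):
    print(*row)
```

In C the same ordering can be obtained with the standard qsort and a comparator that compares the columns left to right.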

KDB/Q how do we calculate the moving median

There is already a moving average in kdb/q.
https://code.kx.com/q/ref/avg/#mavg
But how do I compute moving median?
Here is a naive approach. It starts with an empty list and null median and iterates over the list feeding in a new value each time.
sublist is used to cap the window at the last x items, and this window is passed along with the median as the state into the next iteration.
At the end, scan (\) outputs the state at every iteration, from which we take the median (the first element) of each one.
mmed:{{(med l;l:neg[x] sublist last[y],z)}[x]\[(0n;());y][;0]}
q)mmed[5;til 10]
0 0.5 1 1.5 2 3 4 5 6 7
q)i:4 9 2 7 0 1 9 2 1 8
q)mmed[3;i]
4 6.5 4 7 2 1 1 2 2 2
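The same naive approach can be sketched in Python, assuming a deque with maxlen plays the role of neg[x] sublist (it keeps only the most recent w values). The name mmed is reused from the q version purely for comparison; as in the q code, the first w-1 medians are taken over a shorter, still-growing window.

```python
from collections import deque
from statistics import median

def mmed(w, xs):
    """Moving median over the last w values; early windows are shorter than w."""
    window = deque(maxlen=w)   # automatically discards the oldest value when full
    out = []
    for x in xs:
        window.append(x)
        out.append(median(window))
    return out

print(mmed(5, range(10)))
print(mmed(3, [4, 9, 2, 7, 0, 1, 9, 2, 1, 8]))
```

Each step re-sorts the window inside median, so the cost per item grows with w; that is fine for small windows but is the reason the approach is called naive.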
There's also a generic "sliding window" function here which you can pass your desired aggregator into: https://code.kx.com/q/kb/programming-idioms/#how-do-i-apply-a-function-to-a-sequence-sliding-window
q)swin:{[f;w;s] f each { 1_x,y }\[w#0;s]}
q)swin[avg; 3; til 10]
0 0.33333333 1 2 3 4 5 6 7 8
q)update newcol:swin[med;10;mycol] from tab

MATLAB: the same cell in a different matrix

I have two matrices A and B. Suppose I would like to find the smallest number in each row of matrix A and then, for the cell that this number occupies in A, find the number in the same cell of matrix B. For example, if the minima in A are at positions A(1,3), A(2,9)..., I want the corresponding numbers B(1,3), B(2,9)... Is it possible to do this, or am I asking something hard for MATLAB? I hope someone can help me.
You can use min to find the minimum of each row (that is, across the columns). The second output of min gives the column index of that minimum in each row. Once you have these indices, simply use sub2ind to access the corresponding values in B. As such, try something like this:
[~,ind] = min(A,[],2);
val = B(sub2ind(size(A), (1:size(A,1)).', ind));
val would contain the output values in the matrix B which correspond to the same positions as the minimum values of each row in A. This is also assuming that A and B are the same size. As an illustration, here's an example. Let's set A and B to be a random 4 x 4 array of integers each.
rng(123);
A = randi(10, 4, 4)
B = randi(10, 4, 4)
A =
7 8 5 5
3 5 4 1
3 10 4 4
6 7 8 8
B =
2 7 8 3
2 9 4 7
6 8 4 1
6 7 3 5
By running the first line of code, we get this:
[~,ind] = min(A,[],2)
ind =
3
4
1
1
This tells us that the minimum value of the first row is in the third column, the minimum value of the next row is in the 4th column, and so on. Once we have these column numbers, let's access the corresponding values in B, so we want the row and column pairs (1,3), (2,4), etc. Therefore, after running the second statement, we get:
val = B(sub2ind(size(A), (1:size(A,1)).', ind))
val =
8
7
6
6
If you quickly double check the accessed positions in B in comparison to A, we have found exactly those spots in B that correspond to A.
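The lookup can also be sketched with plain Python lists, using the same A and B as above. The function name values_at_row_minima is hypothetical; like MATLAB's min, list.index returns the first position when a row has tied minima.

```python
def values_at_row_minima(A, B):
    """For each row, find the column of the (first) minimum in A and read B there."""
    out = []
    for row_a, row_b in zip(A, B):
        col = row_a.index(min(row_a))   # 0-based column of the first minimum
        out.append(row_b[col])
    return out

A = [[7, 8, 5, 5],
     [3, 5, 4, 1],
     [3, 10, 4, 4],
     [6, 7, 8, 8]]
B = [[2, 7, 8, 3],
     [2, 9, 4, 7],
     [6, 8, 4, 1],
     [6, 7, 3, 5]]

print(values_at_row_minima(A, B))
```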
A = randi(9,[5 5]);
B = randi(9,[5 5]);
[C,I] = min(A');
B.*(A == repmat(C',1,size(A,2)))
For example:
A =
2 1 6 9 1
2 4 4 4 2
5 6 5 5 5
9 3 9 3 6
4 5 6 8 3
B =
3 5 6 8 1
9 2 9 7 1
5 6 6 5 6
4 6 1 4 5
5 3 7 1 9
ans =
0 5 0 0 1
9 0 0 0 1
5 0 6 5 6
0 6 0 4 0
0 0 0 0 9
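You can also use the mask as a logical index to extract the values directly. Note that tied minima are all returned, in column-major order: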
B(A == repmat(C',1,5))
ans =
9
5
5
6
6
5
4
1
1
6
9
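To make the behavior of the mask approach explicit, here is a rough Python sketch using the same 5 x 5 example. The function name is hypothetical. It keeps every position where A equals its row minimum, ties included, and walks the matrix column by column to mimic MATLAB's column-major logical indexing.

```python
def values_at_all_row_minima(A, B):
    """Collect B at every position where A equals its row minimum (ties kept)."""
    n_rows, n_cols = len(A), len(A[0])
    row_min = [min(row) for row in A]
    # Outer loop over columns reproduces MATLAB's column-major ordering.
    return [B[r][c]
            for c in range(n_cols)
            for r in range(n_rows)
            if A[r][c] == row_min[r]]

A = [[2, 1, 6, 9, 1],
     [2, 4, 4, 4, 2],
     [5, 6, 5, 5, 5],
     [9, 3, 9, 3, 6],
     [4, 5, 6, 8, 3]]
B = [[3, 5, 6, 8, 1],
     [9, 2, 9, 7, 1],
     [5, 6, 6, 5, 6],
     [4, 6, 1, 4, 5],
     [5, 3, 7, 1, 9]]

print(values_at_all_row_minima(A, B))
```

Because ties are kept, the result can have more entries than there are rows, which is the main difference from taking only the first minimum per row.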