KDB/Q: How to implement moving rank efficiently?

I am trying to implement a moving rank function that takes two parameters: n, the number of items, and m, the column name. Here is my implementation:
mwindow: k){[y;x]$[y>0;x#(!#x)+\:!y;x#(!#x)+\:(!-y)+y+1]};
mrank: {[n;x] sum each x > prev mwindow[neg n;x]};
But this takes quite some time when n is moderately large, say 100.
I figure this is because it has to calculate every window from scratch, unlike msum, which keeps a running variable and only calculates the difference between the newly added item and the dropped one.

There are a number of general sliding window functions here that you can use to generate rolling lists on which to apply your rank: https://code.kx.com/q/kb/programming-idioms/#how-do-i-apply-a-function-to-a-sequence-sliding-window
Those approaches seem to pad the lists with zeros/nulls, however, which I think won't really suit your use of rank. Here's another possible approach which might be more suitable for rank (though I haven't tested its performance at scale):
q)mwin:{x each (),/:{neg[x]sublist y,z}[y]\[z]}
q)update r:mwin[rank;4;c] from ([]c:10?100)
c r
----------
84 ,0
25 1 0
31 2 0 1
0 3 1 2 0
51 1 2 0 3
29 2 0 3 1
25 0 3 2 1
73 2 1 0 3
0 2 1 3 0
6 2 3 0 1
q)update r:last each mwin[rank;4;c] from ([]c:10?100)
c r
----
38 0
72 1
13 0
77 3
64 1
9 0
37 1
79 3
97 3
63 1
q)
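On the efficiency concern in the question: one way to avoid re-ranking every window from scratch is to keep the current window in sorted order and update it as each value enters and leaves. The following is only a sketch of that idea, written in Python rather than q and not benchmarked against the q versions above. The name moving_rank and its arguments are made up for illustration. It returns the rank of the newest item within the last n items, which is what last each mwin[rank;4;c] produces, assuming tied values are ranked in order of appearance.

```python
from bisect import bisect_left, bisect_right, insort
from collections import deque


def moving_rank(values, n):
    """Rank of each value among the last n values (itself included)."""
    window = deque()  # values currently in the window, in arrival order
    ordered = []      # the same values, kept sorted
    ranks = []
    for v in values:
        if len(window) == n:
            # The oldest value slides out of the window.
            oldest = window.popleft()
            del ordered[bisect_left(ordered, oldest)]
        # Number of earlier window values <= v: the 0-based rank of v,
        # with ties placed after earlier equal values.
        ranks.append(bisect_right(ordered, v))
        insort(ordered, v)
        window.append(v)
    return ranks


# Data from the second q example above (window of 4).
print(moving_rank([38, 72, 13, 77, 64, 9, 37, 79, 97, 63], 4))
# [0, 1, 0, 3, 1, 0, 1, 3, 3, 1]
```

Each step does a binary search instead of comparing against all n items. Inserting into and deleting from a Python list still shifts elements, so this is not strictly logarithmic per step, but the shift is a cheap memory move.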

Related

Is it possible to show a matrix as a tree in MATLAB?

I have the matrix below and want to know whether it is possible to show it as a tree in MATLAB.
It contains two classes (1 and 2, in the last column), and every row would be a branch of the tree.
A zero in this matrix means there is nothing there: the rows have different lengths, so I padded them with zeros.
I tried the following function in MATLAB, but it seems I should write a custom one.
treeplot(A);
Matrix A:
10 13 0 1
10 13 22 1
10 22 0 1
13 22 0 1
17 26 0 1
4 12 0 2
4 12 15 2
4 15 0 2
7 12 0 2
7 12 15 2
7 15 0 2
12 15 0 2
12 15 17 2
15 17 0 2
As an example, the first 4 rows of this matrix drawn on paper would be:

All unique multiplication products

I'd like to obtain all unique products for a given vector.
For example, given a:
a = [4,10,12,3,6]
I want to obtain a matrix that contains the results of:
4*10
4*12
4*3
4*6
10*12
10*3
10*6
12*3
12*6
3*6
Is there a short and/or quick way of doing this in MATLAB?
EDIT: a may contain duplicate numbers, giving duplicate products - and these must be kept.
Given:
a =
4 10 12 3 6
Construct the matrix of all pairwise products:
>> all_products = a .* a.'
all_products =
16 40 48 12 24
40 100 120 30 60
48 120 144 36 72
12 30 36 9 18
24 60 72 18 36
Now, construct a mask to keep only those values below the main diagonal:
>> mask = tril(true(size(all_products)), -1)
mask =
0 0 0 0 0
1 0 0 0 0
1 1 0 0 0
1 1 1 0 0
1 1 1 1 0
and apply the mask to the product matrix:
>> unique_products = all_products(mask)
unique_products =
40
48
12
24
120
30
60
36
72
18
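For readers who want to check the ordering outside MATLAB, here is a rough Python equivalent of the mask approach. It is only a sketch: the function name is invented, and it walks the matrix column by column because that is the order in which MATLAB's logical indexing returns elements.

```python
def pairwise_products_lower(a):
    """Products strictly below the diagonal of the outer product of a."""
    n = len(a)
    outer = [[a[i] * a[j] for j in range(n)] for i in range(n)]
    # Iterate columns first, keeping only rows below the main diagonal,
    # to mimic column-major logical indexing.
    return [outer[i][j] for j in range(n) for i in range(j + 1, n)]


print(pairwise_products_lower([4, 10, 12, 3, 6]))
# [40, 48, 12, 24, 120, 30, 60, 36, 72, 18]
```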
If you have the Statistics Toolbox, you can abuse pdist, which considers only one of the two possible orders for each pair:
result = pdist(a(:), @times);
One option involves nchoosek, which returns all combinations of k elements out of a vector, with each row holding one combination. prod then computes the product along the rows or columns:
a = [4,10,12,3,6];
b = nchoosek(a,2);
b = prod(b,2); % 2 = multiply along the second dimension, i.e. across each row
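The same combinations-then-product idea can be sketched in Python with the standard library, where itertools.combinations plays the role of nchoosek. The function name pair_products is just illustrative. Duplicate values in the input give duplicate products, which are kept.

```python
from itertools import combinations
from math import prod


def pair_products(a):
    """Product of every unordered pair of elements, duplicates kept."""
    return [prod(pair) for pair in combinations(a, 2)]


print(pair_products([4, 10, 12, 3, 6]))
# [40, 48, 12, 24, 120, 30, 60, 36, 72, 18]
```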
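Try starting with this. Take the outer product of a with itself and have the unique function filter out the repeated values. Note that the result also contains the squares from the diagonal, and that unique collapses duplicate products.
b = unique(a.'*a)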

Read a big data file with headlines into a matrix

I have a file that looks like this (with real data and much bigger):
A B C D E F G H I
1 105.28 1 22 84 2 10.55 21 2
2 357.01 0 32 34 1 11.43 28 1
3 150.23 3 78 22 0 12.02 11 0
4 357.01 0 32 34 1 11.43 28 1
5 357.01 0 32 34 1 11.43 28 1
6 357.01 0 32 34 1 11.43 28 1
...
17000 357.01 0 32 34 1 11.43 28 1
I want to import all the numerical values into a matrix, skipping the header line. For that purpose I use this code:
Filename = 'test.txt';
A = dlmread(Filename,' ',1,0); %Imports the whole data into a matrix
The problem is that A comes out as a 17000-by-1 vector instead of a matrix with several columns. If I manually edit the data file to remove the header line and run just this, it works:
A = dlmread(Filename); %Imports the whole data into a matrix
But I would prefer not to do this, since the headers are used later on in the code. Any advice on how to get this to work?
edit: solved by using
' '
instead of just
' '
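As a language-neutral illustration of why the delimiter matters, here is a small Python sketch that keeps the header row and splits every line on any run of whitespace, so the exact number of spaces between columns stops mattering. The function name and the three-column sample are illustrative only, not taken from the real file.

```python
import io


def read_numeric_table(lines):
    """Return (header, rows) from whitespace-delimited text lines."""
    it = iter(lines)
    header = next(it).split()  # first line holds the column names
    # split() with no argument treats any run of spaces or tabs as one
    # delimiter; blank lines are skipped.
    rows = [[float(tok) for tok in line.split()] for line in it if line.strip()]
    return header, rows


sample = io.StringIO(
    "A B  C\n"
    "1 105.28  1\n"
    "2 357.01   0\n"
)
header, rows = read_numeric_table(sample)
print(header)  # ['A', 'B', 'C']
print(rows)    # [[1.0, 105.28, 1.0], [2.0, 357.01, 0.0]]
```

For a file on disk, pass the open file object instead of the StringIO sample.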
Use the import tool.
Make sure you choose the data.
Generate script.

Matlab replace consecutive zero value with others value

I have this matrix:
A = [92 92 92 91 91 91 146 146 146 0
0 0 112 112 112 127 127 127 35 35
16 16 121 121 121 55 55 55 148 148
0 0 0 96 96 0 0 0 0 0
0 16 16 16 140 140 140 0 0 0]
How can I replace each run of consecutive zeros with a shuffled run of consecutive values from matrix B?
B = [3 3 3 5 5 6 6 2 2 2 7 7 7]
The required result is a matrix like this:
A = [92 92 92 91 91 91 146 146 146 0
6 6 112 112 112 127 127 127 35 35
16 16 121 121 121 55 55 55 148 148
7 7 7 96 96 5 5 3 3 3
0 16 16 16 140 140 140 2 2 2]
You can simply do it like this:
[M,N]=size(A);
for i=1:M
    for j=1:N
        if A(i,j)==0
            % wrap the index so that it never runs past the end of B
            A(i,j)=B(mod(i+j-1,numel(B))+1);
        end
    end
end
If I understand it correctly from what you've described, your solution is going to need the following steps:
Loop over the rows of your matrix, e.g. for row = 1:size(A, 1)
Loop over the elements of each row, identify where each run of zeroes starts and store the indices and the length of the run. For example you might end up with a matrix like: consecutiveZeroes = [ 2 1 2 ; 4 1 3 ; 4 6 5 ; 5 8 3 ] indicating that you have a run at (2, 1) of length 2, a run at (4, 1) of length 3, a run at (4, 6) of length 5, and a run at (5, 8) of length 3.
Now loop over the elements of B counting up how many elements there are of each value. For example you might store this as replacementValues = [ 3 3 ; 2 5 ; 2 6 ; 3 2 ; 3 7 ] meaning three 3's, two 5's, two 6's etc.
Now take a row from your consecutiveZeroes matrix and randomly choose a row of replacementValues that specifies the same number of elements, replace the zeroes in A with the values from replacementValues, and delete the row from replacementValues to show that you've used it.
If there isn't a row in replacementValues that describes a long enough run of values to replace one of your runs of zeroes, find a combination of two or more rows from replacementValues that will work.
You can't do this with just a single pass through the matrix, because presumably you could have a matrix A like [ 15 7 0 0 0 0 0 0 3 ; 2 0 0 0 5 0 0 0 9 ] and a vector B like [ 2 2 2 3 3 3 7 7 5 5 5 5 ], where you can only achieve what you want if you use the four 5's and two 7's and not the three 2's and three 3's to substitute for the run of six zeroes, because you have to leave the 2's and 3's for the two runs of three zeroes in the next row. The easiest approach if efficiency is not critical would probably be to run the algorithm multiple times trying different random combinations until you get one that works - but you'll need to decide how many times to try before giving up in case the input data actually has no solution.
If you get stuck on any of these steps I suggest asking a new, more specific question.
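The first two steps above (locating the runs of zeros and counting the runs in B) can be sketched as follows. This is a Python illustration, not a full solution: the random matching step is left out, and the names zero_runs, value_runs and min_len are invented for the example. It assumes that isolated zeros are left alone (min_len=2), as in the consecutiveZeroes example and in the required result, and it counts consecutive repeats in B rather than totals per value.

```python
from itertools import groupby


def zero_runs(matrix, min_len=2):
    """List (row, col, length) for each run of zeros, with 1-based indices."""
    runs = []
    for r, row in enumerate(matrix, start=1):
        col = 1
        for value, group in groupby(row):
            length = len(list(group))
            if value == 0 and length >= min_len:
                runs.append((r, col, length))
            col += length
    return runs


def value_runs(vector):
    """Run-length encode a vector as (count, value) pairs."""
    return [(len(list(group)), value) for value, group in groupby(vector)]


A = [
    [92, 92, 92, 91, 91, 91, 146, 146, 146, 0],
    [0, 0, 112, 112, 112, 127, 127, 127, 35, 35],
    [16, 16, 121, 121, 121, 55, 55, 55, 148, 148],
    [0, 0, 0, 96, 96, 0, 0, 0, 0, 0],
    [0, 16, 16, 16, 140, 140, 140, 0, 0, 0],
]
B = [3, 3, 3, 5, 5, 6, 6, 2, 2, 2, 7, 7, 7]

print(zero_runs(A))   # [(2, 1, 2), (4, 1, 3), (4, 6, 5), (5, 8, 3)]
print(value_runs(B))  # [(3, 3), (2, 5), (2, 6), (3, 2), (3, 7)]
```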

How to set an indexed value in a matrix based on another matrix's values

Say I have a matrix A
A =
0 1 2
2 1 1
3 1 2
and another matrix B
B =
0 42
1 24
2 32
3 12
I want to replace each value in A by the one associated to it in B.
I would obtain
A =
42 24 32
32 24 24
12 24 32
How can I do that without loops?
There are several ways to accomplish this, but here is a short one:
[~,ind]=ismember(A,B(:,1));
Anew = reshape(B(ind,2),size(A))
If you can assume that the first column of B is always 0:size(B,1)-1, then it is easier, becoming just reshape(B(A+1,2),size(A)).
arrayfun(@(x)(B(find((x)==B(:,1)),2)),A)
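All of these answers do the same thing: they treat the first column of B as keys and the second as values, then look up every element of A. A minimal Python sketch of that lookup, with the illustrative name remap, might look like this. It assumes every value in A appears in the first column of B.

```python
def remap(A, B):
    """Replace each entry of A by the value paired with it in B."""
    lookup = {key: value for key, value in B}  # first column -> second column
    return [[lookup[x] for x in row] for row in A]


A = [[0, 1, 2], [2, 1, 1], [3, 1, 2]]
B = [[0, 42], [1, 24], [2, 32], [3, 12]]
print(remap(A, B))
# [[42, 24, 32], [32, 24, 24], [12, 24, 32]]
```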