AVX512 or AVX2 vector permutations, group uint16 elements by strides of 16 (32 bytes) - x86-64

I have 64 two-byte (short) numbers in memory like this: 0 1 2 3 ... 63. I want to shuffle them so that they look like this in memory:
0 16 32 48 1 17 33 49 2 18 34 50 ... 15 31 47 63
What is the most efficient way to do this using AVX2 or AVX-512?
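For reference, the requested layout is the transpose of a 4x16 row-major matrix into a 16x4 one. The scalar sketch below only pins down the index mapping that any AVX2 or AVX-512 version would have to reproduce; it is not a SIMD implementation, and the helper name `stride_shuffle` is made up for illustration.

```python
def stride_shuffle(values, stride=16):
    """Scalar reference: transpose a (len/stride) x stride row-major matrix."""
    rows = len(values) // stride
    # Output position rows*c + r takes the element at input position r*stride + c.
    return [values[r * stride + c] for c in range(stride) for r in range(rows)]

src = list(range(64))
dst = stride_shuffle(src)
print(dst[:8])   # [0, 16, 32, 48, 1, 17, 33, 49]
print(dst[-4:])  # [15, 31, 47, 63]
```

A vectorized version can be checked against this reference element by element.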

How to apply an "interface" method to a set of rows in kdb?

Sorry if this is a newbie question again.
I am trying to replicate the functionality of interfaces as seen in c++, rust etc. in kdb as is shown in a simple demonstration below:
q).iface.a.fun:{x*y+z}
q).iface.b.fun:{x*x+y+z}
q)ifaces:`a`b; // for demonstration purposes
q)tab:([]time:`datetime$();kind:`ifaces$();x:`long$();y:`long$();z:`long$());
q)n:10;
q)tab,:flip(n#.z.z;n?ifaces;n?10;n?10;n?10)
Now you would assume that the kind would be able to reference the `a`b fun methods of the iface interface as follows:
q)?[`tab;();0b;`max`ifaceval!((max;`x);(`.iface;`kind;`fun;`x;`y;`z))]
evaluation error:
fun
[0] ?[`tab;();0b;`max`ifaceval!((max;`x);(`.iface;`kind;`fun;`x;`y;`z))]
^
Obviously the functional nature of the select inhibits referencing the fun method on account of the symbol type field declarations.
You can avert this error by using enlist as follows:
q)?[`tab;();0b;`max`ifaceval!((max;`x);(`.iface;`kind;enlist`fun;`x;`y;`z))]
max ifaceval ..
-----------------------------------------------------------------------------..
9 77 154 95 65 0 128 153 126 60 49 77 154 95 65 0 128 153 126 60 49 77 154 ..
However this duplicates the result of fun for each row.
How might one effectively go about this without getting the above malformed responses?
Thanks again.
Selecting ifaceval first will ensure each row is returned. max x is a scalar, which forces all the ifaceval entries into one row. The scalar will be expanded across all rows if a vector column precedes it.
q)?[`tab;();0b;`ifaceval`max!((`.iface;`kind;enlist`fun;`x;`y;`z);(max;`x))]
ifaceval max
-------------------------------------
160 11 126 28 32 60 76 10 112 168 8
96 10 77 24 16 35 60 6 63 104 8
96 10 77 24 16 35 60 6 63 104 8
96 10 77 24 16 35 60 6 63 104 8
96 10 77 24 16 35 60 6 63 104 8
160 11 126 28 32 60 76 10 112 168 8
96 10 77 24 16 35 60 6 63 104 8
160 11 126 28 32 60 76 10 112 168 8
160 11 126 28 32 60 76 10 112 168 8
160 11 126 28 32 60 76 10 112 168 8
I'm not sure if this is exactly what you're looking for though. If you want to calculate ifaceval for each row in the table, this should work.
q)?[tab;();0b;`ifaceval`max!(((';(`.iface;::;enlist`fun));`kind;`x;`y;`z);(max;`x))]
ifaceval max
------------
160 8
10 8
77 8
24 8
16 8
60 8
60 8
10 8
112 8
168 8
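As a language-neutral illustration of the per-row dispatch above, here is a minimal Python sketch. It assumes q's right-to-left evaluation, so `{x*y+z}` is read as `x*(y+z)` and `{x*x+y+z}` as `x*(x+y+z)`; the dictionary `IFACE` and the sample rows are hypothetical stand-ins for `.iface` and `tab`.

```python
# Hypothetical stand-in for the .iface namespace: one "fun" per kind.
IFACE = {
    "a": lambda x, y, z: x * (y + z),      # q: {x*y+z}, evaluated right to left
    "b": lambda x, y, z: x * (x + y + z),  # q: {x*x+y+z}
}

def iface_vals(rows):
    """Look up the function by each row's kind and apply it to that row only."""
    return [IFACE[r["kind"]](r["x"], r["y"], r["z"]) for r in rows]

rows = [
    {"kind": "a", "x": 2, "y": 3, "z": 4},
    {"kind": "b", "x": 2, "y": 3, "z": 4},
]
vals = iface_vals(rows)
col_max = max(r["x"] for r in rows)  # scalar aggregate, repeated on every row
print([(v, col_max) for v in vals])  # [(14, 2), (18, 2)]
```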
One point to make is that it's probably best to avoid using kdb keywords for column names. Although it works in functional queries, it does not for qSQL ones.
q)select max:max x from tab
'assign
[0] select max:max x from tab
^

How can I interrupt a 'loop' in kdb?

numb is a list of numbers:
q))input
42 58 74 51 63 23 41 40 43 16 64 29 35 37 30 3 34 33 25 14 4 39 66 49 69 13..
31 41 39 27 9 21 7 25 34 52 60 13 43 71 10 42 19 30 46 50 17 33 44 28 3 62..
15 57 4 55 3 28 14 21 35 29 52 1 50 10 39 70 43 53 46 68 40 27 13 69 20 49..
3 34 11 53 6 5 48 51 39 75 44 32 43 23 30 15 19 62 64 69 38 29 22 70 28 40..
18 30 60 56 12 3 47 46 63 19 59 34 69 65 26 61 50 67 8 71 70 44 39 16 29 45..
I want to iterate through each row and calculate the sum of the first 2, then 3, then 4 numbers, etc. If that sum is greater than 1000, I want to stop the iteration on that particular row, jump to the next row, and do the same thing. This is my code:
{[input]
tot::tot+{[x;y]
if[1000<sum x;:count x;x,y]
}/[input;input]
}each numb
My problem here is that after the count of x is added to tot the over keeps going on the same row. How can I exit over and jump on the next row?
UPDATE: (QUESTION STILL OPEN) I do appreciate all the answers so far but I am not looking for an efficient way to sum the first n numbers. My question is how do I break the over and jump on the next line. I would like to achieve the same thing as with those small scripts:
C++
for (int i = 0; i <= 100; i++) {
    if (i == 50) { printf("for loop exited at: %i ", i); break; }
}
Python
for i in range(100):
    if i == 50:
        print(i)
        break
R
for(i in 1:100){
if(i == 50){
print(i)
break
}
}
I think this is what you are trying to accomplish.
sum {(x & sums y) ? x}[1000] each input
It takes a cumulative sum of each row and takes an element wise minimum between that sum and the input limit thereby capping the output at the limit like so:
q)(100 & sums 40 43 16 64 29)
40 83 99 100 100
It then uses the ? operator to find the index of the first occurrence of that limit (i.e. the element where the limit was equaled or passed). Because the index is 0-based, it equals the number of elements before the limit was reached: in the example, the first 100 occurs after 3 elements. You might want to add one to include the element that crosses the limit in the count.
q)40 83 99 100 100 ? 100
3
And then it sums this count over all rows of the input.
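The same cap-and-find idea can be sketched in Python, assuming (as q's `?` does) that a limit that is never reached yields the length of the row. The function name `count_before_limit` is made up for illustration.

```python
from itertools import accumulate

def count_before_limit(row, limit):
    """Number of leading elements whose running sum stays below the limit."""
    capped = [min(limit, s) for s in accumulate(row)]  # like limit & sums row
    # Like q's find: index of the first match, or the list length if absent.
    return capped.index(limit) if limit in capped else len(capped)

print(count_before_limit([40, 43, 16, 64, 29], 100))  # 3

rows = [[40, 43, 16, 64, 29], [98, 11, 42]]
total = sum(count_before_limit(r, 100) for r in rows)  # 3 + 1
```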
You could use converge in this case to exit when you fail to satisfy a condition:
https://code.kx.com/q/ref/adverbs/#converge-repeat
The first parameter would be a function that does your check based on the current value of x which will be the next value to be passed in the main function.
For your example I've made a projection using the main input line, then increased the index of what I am summing each time:
q)numb
98 11 42 97 89 80 73 35 4 30
86 33 38 86 26 15 83 71 21 22
23 43 41 80 56 11 22 28 47 57
q){[input] {x+1}/[{100>sum (y+1)#x}[input;];0] }each numb
1 1 2
This returns, for each row, the first index at which the running sum reaches 100.
However, this isn't really an ideal use case for kdb.
It could instead be done with something like
(sums each numb) binr\: 100
maybe your real example makes more sense
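For comparison with the loop-and-break scripts in the question, here is a rough Python equivalent of what the converge expression computes: scan each row and leave the loop as soon as the running sum reaches the limit. The function name is hypothetical, and returning the row length when the limit is never reached is an assumption rather than the q behaviour.

```python
def first_index_reaching(row, limit):
    """Index of the first element at which the running sum reaches the limit."""
    total = 0
    for i, value in enumerate(row):
        total += value
        if total >= limit:
            return i      # early exit: the rest of the row is never visited
    return len(row)       # assumption: limit never reached

numb = [
    [98, 11, 42, 97, 89, 80, 73, 35, 4, 30],
    [86, 33, 38, 86, 26, 15, 83, 71, 21, 22],
    [23, 43, 41, 80, 56, 11, 22, 28, 47, 57],
]
print([first_index_reaching(row, 100) for row in numb])  # [1, 1, 2]
```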
You can use while loops in KDB although all KDB developers are generally too afraid of being openly mocked and laughed at for doing so
q){i:0;while[i<>50;i+:1];:"loop exited at ",string i}`
"loop exited at 50"
Kdb does have a "stop loop" mechanism but only in the case of a monadic function with single seed value
/keep squaring until number is no longer less than 1000, starting at 2
q){x*x}/[{x<1000};2]
65536
/keep dealing random numbers under 20 until you get an 18 (seed value 0 is irrelevant)
q){first 1?20}\[18<>;0]
0 19 17 12 15 10 18
However this doesn't really fit your use case and as other people have pointed out, this is not how you would/should solve this problem in kdb.
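The "stop loop" form of over shown above behaves like a plain while loop: apply the function for as long as the test on the current value holds. A minimal Python sketch, with the hypothetical helper name `iterate_while`:

```python
def iterate_while(step, keep_going, seed):
    """Apply step repeatedly while keep_going(current value) is true."""
    x = seed
    while keep_going(x):
        x = step(x)
    return x

# Keep squaring until the number is no longer less than 1000, starting at 2:
# 2 -> 4 -> 16 -> 256 -> 65536
print(iterate_while(lambda x: x * x, lambda x: x < 1000, 2))  # 65536
```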

Octave: How can I vectorize this for-loop?

Can this for-loop be vectorized?
I want to be able to vectorize the for-loop of this code to obtain a matrix like "sample". Trying to vectorize I got the "sample2" matrix, however as you can see it does not show the values I want for each row due to the linear index when I take "data" as a matrix instead of as a vector.
close all; clear all; clc;
N=5; n=10; n1=2; n2=8;
rand('state', sum(100*clock));
choose=round(((n-1)*rand(N,n))+1);
data=choose.^2;
idx=choose(:,n1:n2);
for i=1:N
dat=data(i,:);
sample(i,:)=dat(idx(i,:));
end
%Trying to vectorize to get the same result
sample2(:,(n1:n2)-n1+1)=data(idx);
Results:
data =
36 64 64 25 81 4 100 36 49 25
4 4 1 16 4 16 81 16 100 64
36 81 36 25 16 16 1 64 49 4
36 64 49 49 25 36 100 64 81 64
1 16 16 49 64 49 81 4 16 64
idx =
8 8 5 9 2 10 6
2 1 4 2 4 9 4
9 6 5 4 4 1 8
8 7 7 5 6 10 8
4 4 7 8 7 9 2
sample =
36 36 81 49 64 25 4
4 4 16 4 16 100 16
49 16 16 25 25 36 64
64 100 100 25 36 64 64
49 49 81 4 81 16 16
sample2 =
81 81 1 64 4 16 64
4 36 36 4 36 64 36
64 64 1 36 36 36 81
81 4 4 1 64 16 81
36 36 4 81 4 64 4
It looks like you are trying to index in row-major order, but Octave indexes in column-major order. You can transpose your input to get the indices right. Also, if you want to index into the second column, you can just add the length of the first column.
Try this:
data2 = data';
sample2 = data2([idx + [0:size(idx,1)-1]'*size(data,2)])
First line just transposes the matrix so you get what would be row indexing of the original.
Second line modifies the index matrix to be total index instead of row index by adding the length of the original data rows, then references the data to provide the result.
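The offset arithmetic can be checked with a small Python sketch that uses the first two rows of `data` and `idx` from the question. Flattening `data` row by row plays the role of linear indexing into the transposed matrix; the function name `gather_rows` is made up for illustration.

```python
data = [
    [36, 64, 64, 25, 81, 4, 100, 36, 49, 25],
    [4, 4, 1, 16, 4, 16, 81, 16, 100, 64],
]
idx = [
    [8, 8, 5, 9, 2, 10, 6],
    [2, 1, 4, 2, 4, 9, 4],
]

def gather_rows(data, idx):
    """Row i of the result is data[i] indexed by the 1-based entries of idx[i]."""
    n_cols = len(data[0])
    flat = [v for row in data for v in row]  # row-major flattening
    # Row i starts at offset i * n_cols; subtract 1 for the 1-based indices.
    return [[flat[i * n_cols + (j - 1)] for j in row_idx]
            for i, row_idx in enumerate(idx)]

sample = gather_rows(data, idx)
print(sample[0])  # [36, 36, 81, 49, 64, 25, 4]
print(sample[1])  # [4, 4, 16, 4, 16, 100, 16]
```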

How to dynamically reshape matrix block-wise? [duplicate]

Let's say I have A = [1:8; 11:18; 21:28; 31:38; 41:48]. Now I would like to move everything from column 5 onward below the first four columns, as additional rows. How do I achieve this?
A =
1 2 3 4 5 6 7 8
11 12 13 14 15 16 17 18
21 22 23 24 25 26 27 28
31 32 33 34 35 36 37 38
41 42 43 44 45 46 47 48
to
A2 =
1 2 3 4
11 12 13 14
21 22 23 24
31 32 33 34
41 42 43 44
5 6 7 8
15 16 17 18
25 26 27 28
35 36 37 38
45 46 47 48
reshape doesn't seem to do the trick
Here's a vectorized approach with reshape and permute -
reshape(permute(reshape(a,size(a,1),4,[]),[1,3,2]),[],4)
Making it generic, we could introduce the number of columns as a parameter. Hence, let ncols be that one. So, the solution becomes -
ncols = 4
reshape(permute(reshape(a,size(a,1),ncols,[]),[1,3,2]),[],ncols)
Sample run -
>> a
a =
20 79 18 82 27 23 59 66 46 21 48 95
96 83 46 49 34 88 23 42 17 27 15 54
11 88 34 92 23 62 86 56 32 32 91 54
>> reshape(permute(reshape(a,size(a,1),4,[]),[1,3,2]),[],4)
ans =
20 79 18 82
96 83 46 49
11 88 34 92
27 23 59 66
34 88 23 42
23 62 86 56
46 21 48 95
17 27 15 54
32 32 91 54
For more on the intuition behind this, see the general idea for nd-to-nd transformation, which, although originally written for NumPy/Python, extends to other array languages.
Use Matrix indexing!
B=[A(:,1:4);A(:,5:8)]
In a loop...
for ii=0:floor(size(A,2)/4)-1
B([1+5*ii:5*(ii+1)],:)=A(:,[1+4*ii:4*(ii+1)] );
end
One more... perhaps unoptimized way would be to decompose the matrix into cells row-wise, transpose the cell array then concatenate everything back together:
B = cell2mat(mat2cell(A, size(A, 1), 4 * ones((size(A, 2) / 4), 1)).');
The above first uses mat2cell to decompose the matrix into non-overlapping cells. Each cell has the same number of rows as A and 4 columns, and there are exactly size(A, 2) / 4 of them. As such, we pass a vector with size(A, 2) / 4 elements, each equal to 4, to give the number of columns for each cell. This creates a row-wise cell array, so we transpose it and merge all of the cells together into one final matrix with cell2mat.
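All of the answers above produce the same block stacking: cut the matrix into groups of `ncols` columns and place the groups one below another. A plain Python sketch with nested lists, assuming the column count divides evenly by `ncols` and using the made-up name `stack_column_blocks`:

```python
def stack_column_blocks(matrix, ncols):
    """Stack consecutive ncols-wide column blocks of matrix vertically."""
    n_blocks = len(matrix[0]) // ncols
    # Outer loop over blocks, inner loop over rows, so each block stays together.
    return [row[b * ncols:(b + 1) * ncols]
            for b in range(n_blocks) for row in matrix]

A = [list(range(s, s + 8)) for s in (1, 11, 21, 31, 41)]
A2 = stack_column_blocks(A, 4)
print(A2[0])  # [1, 2, 3, 4]
print(A2[5])  # [5, 6, 7, 8]
```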

Ask for MATLAB code to detect steady state of data

I have a vector of electrical power consumption data which consists of transient, steady and power off states. I would like to identify steady-state starting point by the following condition:
Five consecutive elements of the data have differences between adjacent elements that are <= a threshold value (for this case, let's say 10 W)
The first element that meets the condition shows the starting point of steady-state.
Example:
data = [0 0 0 40 70 65 59 50 38 30 32 33 30 33 37 19 ...
0 0 0 41 73 58 43 34 25 39 33 38 34 31 35 38 19 0]
abs(diff(data)) = [0 0 40 30 15 7 9 12 8 3 2 1 3 4 18 19 ...
0 0 41 32 15 9 14 6 5 4 3 4 3 19 19 0]
The sequences of abs(diff(data)) that meet the condition are 8 3 2 1 3 and 6 5 4 3 4. Therefore, the output should show the 10th data element (=30) and 27th data element (=33) as starting point of steady-state (There are 2 times of steady-state detected).
How would I write MATLAB code for this scenario?
(PS: data = 0 shows power off state)
Here's one approach using nlfilter (if the function is not available, you can implement a sliding window yourself):
data = [0 0 0 40 70 65 59 50 38 30 32 33 30 33 37 19 0 0 0 41 73 58 43 34 25 39 33 38 34 31 35 38 19 0];
difs = abs(diff(data));
% Use sliding window to find windows of consecutive elements below threshold
steady = nlfilter(difs, [1, 5], @(x)all(x <= 10));
% Find where steady state starts (1) and ends (-1)
start = diff(steady);
% Return indices of starting steady state
ind = find(start == 1);
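If nlfilter is not available, the sliding window is easy to write by hand. The Python sketch below shows one way to do it on a short made-up series (not the data from the question). It reports the 0-based position of the first difference in each qualifying run, which is a convention chosen here; shift it by one if the steady state should be counted from the following sample. A stretch of power-off zeros longer than the window would also qualify and would need to be filtered out separately.

```python
def steady_state_starts(data, threshold=10, window=5):
    """0-based starts of runs where `window` consecutive |diffs| are <= threshold."""
    diffs = [abs(b - a) for a, b in zip(data, data[1:])]
    ok = [all(d <= threshold for d in diffs[k:k + window])
          for k in range(len(diffs) - window + 1)]
    # Keep only the first window of each run of qualifying windows.
    return [k for k, flag in enumerate(ok) if flag and (k == 0 or not ok[k - 1])]

# Hypothetical series: ramp up, settle near 100, then switch off.
series = [0, 50, 100, 102, 101, 103, 104, 102, 101, 40, 0]
print(steady_state_starts(series))  # [2]
```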