Vectorization in Matlab to speed up expensive loop - matlab

How can I speed up the following MATLAB code, using vectorization? Right now the single line in the loop is taking hours to run for the case upper = 1e7.
Here is the commented code with sample output:
p = 8;
lower = 1;
upper = 1e1;
n = setdiff(lower:upper,primes(upper)); % contains composite numbers between lower + upper
x = ones(length(n),p); % Preallocated 2-D array of ones
% This loop stores the unique prime factors of each composite
% number from 1 to n, in each row of x. Since the rows will have
% varying lengths, the rows are padded with ones at the end.
for i = 1:length(n)
x(i,:) = [unique(factor(n(i))) ones(1,p-length(unique(factor(n(i)))))];
end
output:
x =
1 1 1 1 1 1 1 1
2 1 1 1 1 1 1 1
2 3 1 1 1 1 1 1
2 1 1 1 1 1 1 1
3 1 1 1 1 1 1 1
2 5 1 1 1 1 1 1
For example, the last row contains the prime factors of 10, if we ignore the ones. I have made the matrix 8 columns wide to account for the many prime factors of numbers up to 10 million.
Thanks for any help!

This is not vectorization, but this version of the loop will save about half of the time:
for k = 1:numel(n)
tmp = unique(factor(n(k)));
x(k,1:numel(tmp)) = tmp;
end
Here is a quick benchmark for this:
function t = getPrimeTime
lower = 1;
upper = 2.^(1:8);
t = zeros(numel(upper),2);
for k = 1:numel(upper)
n = setdiff(lower:upper(k),primes(upper(k))); % contains composite numbers between lower to upper
t(k,1) = timeit(#() getPrime1(n));
t(k,2) = timeit(#() getPrime2(n));
disp(k)
end
p = plot(log2(upper),log10(t));
p(1).Marker = 'o';
p(2).Marker = '*';
xlabel('log_2(range of numbers)')
ylabel('log(time (sec))')
legend({'getPrime1','getPrime2'})
end
function x = getPrime1(n) % the originel function
p = 8;
x = ones(length(n),p); % Preallocated 2-D array of ones
for k = 1:length(n)
x(k,:) = [unique(factor(n(k))) ones(1,p-length(unique(factor(n(k)))))];
end
end
function x = getPrime2(n)
p = 8;
x = ones(numel(n),p); % Preallocated 2-D array of ones
for k = 1:numel(n)
tmp = unique(factor(n(k)));
x(k,1:numel(tmp)) = tmp;
end
end

Here's another approach:
p = 8;
lower = 1;
upper = 1e1;
p = 8;
q = primes(upper);
n = setdiff(lower:upper, q);
x = bsxfun(#times, q, ~bsxfun(#mod, n(:), q));
x(~x) = inf;
x = sort(x,2);
x(isinf(x)) = 1;
x = [x ones(size(x,1), p-size(x,2))];
This seems to be faster than the other two options (but is uses more memory). Borrowing EBH's benchmarking code:
function t = getPrimeTime
lower = 1;
upper = 2.^(1:12);
t = zeros(numel(upper),3);
for k = 1:numel(upper)
n = setdiff(lower:upper(k),primes(upper(k)));
t(k,1) = timeit(#() getPrime1(n));
t(k,2) = timeit(#() getPrime2(n));
t(k,3) = timeit(#() getPrime3(n));
disp(k)
end
p = plot(log2(upper),log10(t));
p(1).Marker = 'o';
p(2).Marker = '*';
p(3).Marker = '^';
xlabel('log_2(range of numbers)')
ylabel('log(time (sec))')
legend({'getPrime1','getPrime2','getPrime3'})
grid on
end
function x = getPrime1(n) % the originel function
p = 8;
x = ones(length(n),p); % Preallocated 2-D array of ones
for k = 1:length(n)
x(k,:) = [unique(factor(n(k))) ones(1,p-length(unique(factor(n(k)))))];
end
end
function x = getPrime2(n)
p = 8;
x = ones(numel(n),p); % Preallocated 2-D array of ones
for k = 1:numel(n)
tmp = unique(factor(n(k)));
x(k,1:numel(tmp)) = tmp;
end
end
function x = getPrime3(n) % Approach in this answer
p = 8;
q = primes(max(n));
x = bsxfun(#times, q, ~bsxfun(#mod, n(:), q));
x(~x) = inf;
x = sort(x,2);
x(isinf(x)) = 1;
x = [x ones(size(x,1), p-size(x,2))];
end

Related

Cumulative count of unique element in Matlab array

Working with Matlab 2019b.
x = [10 10 10 20 20 30]';
How do I get a cumulative count of unique elements in x, which should look like:
y = [1 2 3 1 2 1]';
EDIT:
My real array is actually much longer than the example given above. Below are the methods I tested:
x = randi([1 100], 100000, 1);
x = sort(x);
% method 1: check neighboring values in one loop
tic
y = ones(size(x));
for ii = 2:length(x)
if x(ii) == x(ii-1)
y(ii) = y(ii-1) + 1;
end
end
toc
% method 2 (Wolfie): count occurrence of unique values explicitly
tic
u = unique(x);
y = zeros(size(x));
for ii = 1:numel(u)
idx = (x == u(ii));
y(idx) = 1:nnz(idx);
end
toc
% method 3 (Luis Mendo): triangular matrix
tic
y = sum(triu(x==x'))';
toc
Results:
Method 1: Elapsed time is 0.016847 seconds.
Method 2: Elapsed time is 0.037124 seconds.
Method 3: Elapsed time is 10.350002 seconds.
EDIT:
Assuming that x is sorted:
x = [10 10 10 20 20 30].';
x = sort(x);
d = [1 ;diff(x)];
f = find(d);
d(f) = f;
ic = cummax(d);
y = (2 : numel(x) + 1).' - ic;
When x is unsorted use this:
[s, is] = sort(x);
d = [1 ;diff(s)];
f = find(d);
d(f) = f;
ic = cummax(d);
y(is) = (2 : numel(s) + 1).' - ic;
Original Answer that only works on GNU Octave:
Assuming that x is sorted:
x = [10 10 10 20 20 30].';
x = sort(x);
[~, ic] = cummax(x);
y = (2 : numel(x) + 1).' - ic;
When x is unsorted use this:
[s, is] = sort(x);
[~, ic] = cummax(s);
y(is) = (2 : numel(s) + 1).' - ic;
You could loop over the unique elements, and set their indices to 1:n each time...
u = unique(x);
y = zeros(size(x));
for ii = 1:numel(u)
idx = (x == u(ii));
y(idx) = 1:nnz(idx);
end
This is a little inefficient because it generates an intermediate matrix, when actually only a triangular half is needed:
y = sum(triu(x==x.')).';
Here's a no-for-loop version. On my machine it's a bit faster than the previous working methods:
% if already sorted, can omit this first and last line
[s, is] = sort(x);
[u,~,iu] = unique(s);
c = accumarray(iu,1);
cs = cumsum([0;c]);
z = (1:numel(x))'-repelem(cs(1:end-1),c);
y(is) = z;

How can I do vectorization for this matlab "for loop"?

I have some matlab code as follow, constructing KNN similarity weight matrix.
[D,I] = pdist2(X, X, 'squaredeuclidean', 'Smallest', k+1);
D = D < threshold;
W = zeros(n, n);
for i=1:size(I,2)
W(I(:,i), i) = D(:,i);
W(i, I(:,i)) = D(:,i)';
end
I want to vectorize the for loop. I have tried
W(I) = D;
but failed to get the correct value.
I add test case here:
n = 5;
D = [
1 1 1 1 1
0 1 1 1 1
0 0 0 0 0
];
I = [
1 2 3 4 5
5 4 5 2 3
3 1 1 1 1
];
There are some undefined variables that makes it hard to check what it is doing, but this should do the same as your for loop:
D,I] = pdist2(X, X, 'squaredeuclidean', 'Smallest', k+1);
D = D < threshold;
W = zeros(n);
% set the diagonal values
W(sub2ind(size(X), I(1, :), I(1, :))) = D(1,:);
% set the other values
W(sub2ind(size(W), I(2, :), 1:size(I, 2))) = D(2, :);
W(sub2ind(size(W), 1:size(I, 2), I(2, :))) = D(2, :).';
I splited the directions, it works now with your test case.
A possible solution:
idx1 = reshape(1:n*n,n,n).';
idx2 = bsxfun(#plus,I,0:n:n*size(I,2)-1);
W=zeros(n,n);
W(idx2) = D;
W(idx1(idx2)) = D;
Here assumed that you repeatedly want to compute D and I so compute idx only one time and use it repeatedly.
n = 5;
idx1 = reshape(1:n*n,n,n).';
%for k = 1 : 1000
%[D,I] = pdist2(X, X, 'squaredeuclidean', 'Smallest', k+1);
%D = D < threshold;
idx2 = bsxfun(#plus,I,0:n:n*size(I,2)-1);
W=zeros(n,n);
W(idx2) = D;
W(idx1(idx2)) = D;
%end
But if n isn't constant and it varies in each iteration it is better to change the way idx1 is computed:
n = 5;
%for k = 1 : 1000
%n = randi([2 10]);%n isn't constant
%[D,I] = pdist2(X, X, 'squaredeuclidean', 'Smallest', k+1);
%D = D < threshold;
idx1 = bsxfun(#plus,(0:n:n^2-1).',1:size(I,2));
idx2 = bsxfun(#plus,I,0:n:n*size(I,2)-1);
W=zeros(n,n);
W(idx2) = D;
W(idx1(idx2)) = D;
%end
You can cut some corners with linear indices but if your matrices are big then you should only take the nonzero components of D. Following copies all values of D
W = zeros(n);
W(reshape(sub2ind([n,n],I,[1;1;1]*[1:n]),1,[])) = reshape(D,1,[]);

Matlab: iterate through image blocks

I would like to divide an image into 8 by 6 blocks and then from each block would like to get the average of red, green and blue values then store the average values from each block into an array. Say that if I have image divided into 4 blocks the result array would be:
A = [average_red, average_green, average_blue,average_red, ...
average_green, average_blue,average_red, average_green, ...
average_blue,average_red, average_green, average_blue,...
average_red, average_green, average_blue,]
The loop I have created looks very complicated, takes a long time to run and I'm not even sure if it's working properly or not as I have no clue how to check. Is there any simpler way to implement this.
Here is the loop:
[rows, columns, ~] = size(img);
[rows, columns, ~] = size(img);
rBlock = 6;
cBlock = 8;
NumberOfBlocks = rBlock * cBlock;
bRow = ceil(rows/rBlock);
bCol = ceil(columns/cBlock);
row = bRow;
col = bCol;
r = zeros(row*col,1);
g = zeros(row*col,1);
b = zeros(row*col,1);
n = 1;
cl = 1;
rw = 1;
for x = 1:NumberOfBlocks
for i = cl : col
for j = rw : row
% some code
end
end
%some code
if i == columns && j ~= rows
cl = 1;
rw = j - (bRow -1);
col = (col - col) + bCol;
row = row + bRaw;
elseif a == columns && c == rows
display('done');
else
cl = i + 1;
rw = j - (bRow -1);
col = col + col;
row = row + row;
end
end
Because there are only 48 block, you may use simple for loop iterating blocks. (I think it's going to be fast enough).
Here is my code:
%Build test image
img = double(imresize(imread('peppers.png'), [200, 300]));
[rows, columns, ~] = size(img);
rBlock = 6;
cBlock = 8;
NumberOfBlocks = rBlock * cBlock;
bRow = ceil(rows/rBlock);
bCol = ceil(columns/cBlock);
idx = 1;
A = zeros(1, rBlock*cBlock*3);
for y = 0:rBlock-1
for x = 0:cBlock-1
%Block (y,x) boundaries: (x0,y0) to (x1,y1)
x0 = x*bCol+1;
y0 = y*bRow+1;
x1 = min(x0+bCol-1, columns); %Limit x1 to columns
y1 = min(y0+bRow-1, rows); %Limit y1 to rows
redMean = mean2(img(y0:y1, x0:x1, 1)); %Mean of red pixel in block (y,x)
greenMean = mean2(img(y0:y1, x0:x1, 2)); %Mean of green pixel in block (y,x)
blueMean = mean2(img(y0:y1, x0:x1, 3)); %Mean of blue pixel in block (y,x)
%Fill 3 elements of array A.
A(idx) = redMean;
A(idx+1) = greenMean;
A(idx+2) = blueMean;
%Advance index by 3.
idx = idx + 3;
end
end

How to oppositely order two vectors in Matlab?

I have the code below for oppositely ordering two vectors. It works, but I want to specify the line
B_diff(i) = B(i) - B(i+1);
to hold true not just for only
B_diff(i) = B(i) - B(i+1); but for
B_diff(i) = B(i) - B(i+k); where k can be any integer less than or equal to n. The same applies to "A". Any clues as to how I can achieve this in the program?
For example, I want to rearrange the first column of the matrix
A =
1 4
6 9
3 8
4 2
such that, the condition should hold true not only for
(a11-a12)(a21-a22)<=0;
but also for all
(a11-a13)(a21-a23)<=0;
(a11-a14)(a21-a24)<=0;
(a12-a13)(a22-a23)<=0;
(a12-a14)(a22-a24)<=0; and
(a13-a14)(a23-a24)<=0;
## MATLAB CODE ##
A = xlsread('column 1');
B = xlsread('column 2');
n = numel(A);
B_diff = zeros(n-1,1); %Vector to contain the differences between the elements of B
count_pos = 0; %To count the number of positive entries in B_diff
for i = 1:n-1
B_diff(i) = B(i) - B(i+1);
if B_diff(i) > 0
count_pos = count_pos + 1;
end
end
A_desc = sort(A,'descend'); %Sort the vector A in descending order
if count_pos > 0 %If B_diff contains positive entries, divide A_desc into two vectors
A_less = A_desc(count_pos+1:n);
A_great = sort(A_desc(1:count_pos),'ascend');
A_new = zeros(n,1); %To contain the sorted elements of A
else
A_new = A_desc; %This is then the sorted elements of A
end
if count_pos > 0
A_new(1) = A_less(1);
j = 2; %To keep track of the index for A_less
k = 1; %To keep track of the index for A_great
for i = 1:n-1
if B_diff(i) <= 0
A_new(i+1) = A_less(j);
j = j + 1;
else
A_new(i+1) = A_great(k);
k = k + 1;
end
end
end
A_diff = zeros(n-1,1);
for i = 1:n-1
A_diff(i) = A_new(i) - A_new(i+1);
end
diff = [A_diff B_diff]
prod = A_diff.*B_diff
The following code orders the first column of A opposite to the order of the second column.
A= [1 4; 6 9; 3 8; 4 2]; % sample matrix
[~,ix]=sort(A(:,2)); % ix is the sorting permutation of A(:,2)
inverse=zeros(size(ix));
inverse(ix) = numel(ix):-1:1; % the un-sorting permutation, reversed
B = sort(A(:,1)); % sort the first column
A(:,1)=B(inverse); % permute the first column according to inverse
Result:
A =
4 4
1 9
3 8
6 2

How to do a ~= vector operation in matlab

I'm trying to write my own program to sort vectors in matlab. I know you can use the sort(A) on a vector, but I'm trying to code this on my own. My goal is to also sort it in the fewest amount of swaps which is kept track of by the ctr variable. I find and sort the min and max elements first, and then have a loop that looks at the ii minimum vector value and swaps it accordingly.
This is where I start to run into problems, I'm trying to remove all the ii minimum values from my starting vector but I'm not sure how to use the ~= on a vector. Is there a way do this this with a loop? Thanks!
clc;
a = [8 9 13 3 2 8 74 3 1] %random vector, will be function a once I get this to work
[one, len] = size(a);
[mx, posmx] = max(a);
ctr = 0; %counter set to zero to start
%setting min and max at first and last elements
if a(1,len) ~= mx
b = mx;
c = a(1,len);
a(1,len) = b;
a(1,posmx) = c;
ctr = ctr + 1;
end
[mn, posmn] = min(a);
if a(1,1) ~= mn
b = mn;
c = a(1,1);
a(1,1) = b;
a(1,posmn) = c;
ctr = ctr + 1;
end
ii = 2; %starting at 2 since first element already sorted
mini = [mn];
posmini = [];
while a(1,ii) < mx
[mini(ii), posmini(ii - 1)] = min(a(a~=mini))
if a(1,ii) ~= mini(ii)
b = mini(ii)
c = a(1,ii)
a(1,ii) = b
a(1,ii) = c
ctr = ctr + 1;
end
ii = ii + 1;
end