Vectorization of multiple embedded for loops - matlab

I have the following code, which uses three nested for loops to build an upper triangular matrix. I plan to run it many times on a large data set and want to make it as computationally efficient as possible.
data = magic(3);
n = size(data,1);
W = zeros(n,n);
for i = 1:n
for j = i:n
if i==j
W(i,j)=0;
else
for k = 1:n
temp(1,k) = (data(i,k)-data(j,k))^2;
sumTemp = sumTemp + temp(1,k);
end
W(i,j)=sqrt(sumTemp);
end
temp = 0;
sumTemp = 0;
end
end
Answer should look like:
[0 6.4807 9.7980
0 0 6.4807
0 0 0]
I am working on it right now, but figured I would throw it out there in case anyone has suggestions that would save me hours of fiddling around.

This is what I have at the moment:
data = magic(3);
n = size(data,1);
W = zeros(n,n);
for i = 1:n
for j = i+1:n
W(i,j)= norm(data(i,:)-data(j,:))
%W(i,j)= sqrt(sum((data(i,:)-data(j,:)).^2));
end
end
What I did:
* vectorized the inner loop
* removed www, which is unused
* changed the 2nd loop to start at i+1, because nothing is done for i==j
* replaced sqrt(sum((a-b).^2)) with norm(a-b)
And now the "full" vectorization:
data = magic(3);
n = size(data,1);
W = zeros(n,n);
tri=triu(ones(n,n),1)>0;
[i,j]=find(tri);
W(tri)=arrayfun(@(i,j)norm(data(i,:)-data(j,:)),i,j)

Here is a straightforward solution with bsxfun:
Wfull = sqrt(squeeze(sum(bsxfun(@minus,data,permute(data,[3 2 1])).^2,2)))
W = triu(Wfull)
This works for any N-by-D data, where N is the number of points and D is the number of dimensions. For example,
>> data = magic(3);
>> triu(sqrt(squeeze(sum(bsxfun(@minus,data,permute(data,[3 2 1])).^2,2))))
ans =
0 6.4807 9.7980
0 0 6.4807
0 0 0
>> data = magic(5); data(:,end-1:end)=[]
data =
17 24 1
23 5 7
4 6 13
10 12 19
11 18 25
>> triu(sqrt(squeeze(sum(bsxfun(@minus,data,permute(data,[3 2 1])).^2,2))))
ans =
0 20.8087 25.2389 22.7376 25.4558
0 0 19.9499 19.0263 25.2389
0 0 0 10.3923 18.3576
0 0 0 0 8.5440
0 0 0 0 0
>>
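The bsxfun expression above forms an N-by-D-by-N intermediate array, which can get large when N is big. A possible alternative, sketched here under the assumption that small rounding errors are acceptable, expands the squared distance as |a|^2 + |b|^2 - 2*a.b so that only N-by-N arrays are created. The variable names sq and D2 are made up for this illustration.

```matlab
data = magic(3);                       % N-by-D input, same example as above
sq = sum(data.^2, 2);                  % N-by-1 squared row norms
% Squared distances via |a|^2 + |b|^2 - 2*a.b (only N-by-N temporaries)
D2 = bsxfun(@plus, sq, sq.') - 2*(data*data.');
D2 = max(D2, 0);                       % clamp tiny negatives caused by rounding
W = triu(sqrt(D2), 1);                 % keep the strict upper triangle
```

For magic(3) this should reproduce the 6.4807 and 9.7980 entries shown earlier. For rows that are nearly identical the subtraction can lose precision, so the direct difference form may be preferable there.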

Related

Performance of vectorizing code to create a sparse matrix with a single 1 per row from a vector of indexes

I have a large column vector y containing integer values from 1 to 10. I wanted to convert it to a matrix where each row is full of 0s except for a 1 at the index given by the value at the respective row of y.
This example should make it clearer:
y = [3; 4; 1; 10; 9; 9; 4; 2; ...]
% gets converted to:
Y = [
0 0 1 0 0 0 0 0 0 0;
0 0 0 1 0 0 0 0 0 0;
1 0 0 0 0 0 0 0 0 0;
0 0 0 0 0 0 0 0 0 1;
0 0 0 0 0 0 0 0 1 0;
0 0 0 0 0 0 0 0 1 0;
0 0 0 1 0 0 0 0 0 0;
0 1 0 0 0 0 0 0 0 0;
...
]
I have written the following code for this (it works):
m = length(y);
Y = zeros(m, 10);
for i = 1:m
Y(i, y(i)) = 1;
end
I know there are ways I could remove the for loop in this code (vectorizing). This post contains a few, including something like:
Y = full(sparse(1:length(y), y, ones(length(y),1)));
But I had to convert y to doubles to be able to use this, and the result is actually about 3x slower than my "for" approach, using 10.000.000 as the length of y.
Is it likely that doing this kind of vectorization will lead to better performance for a very large y? I've read many times that vectorizing calculations leads to better performance (not only in MATLAB), but this kind of solution seems to result in more calculations.
Is there a way to actually improve performance over the for approach in this example? Maybe the problem here is simply that acting on doubles instead of ints isn't the best thing for comparison, but I couldn't find a way to use sparse otherwise.
Here is a test to compare:
function [t,v] = testIndicatorMatrix()
y = randi([1 10], [1e6 1], 'double');
funcs = {
@() func1(y);
@() func2(y);
@() func3(y);
@() func4(y);
};
t = cellfun(@timeit, funcs, 'Uniform',true);
v = cellfun(@feval, funcs, 'Uniform',false);
assert(isequal(v{:}))
end
function Y = func1(y)
m = numel(y);
Y = zeros(m, 10);
for i = 1:m
Y(i, y(i)) = 1;
end
end
function Y = func2(y)
m = numel(y);
Y = full(sparse(1:m, y, 1, m, 10, m));
end
function Y = func3(y)
m = numel(y);
Y = zeros(m,10);
Y(sub2ind([m,10], (1:m).', y)) = 1;
end
function Y = func4(y)
m = numel(y);
Y = zeros(m,10);
Y((y-1).*m + (1:m).') = 1;
end
I get:
>> testIndicatorMatrix
ans =
0.0388
0.1712
0.0490
0.0430
Such a simple for-loop can be dynamically JIT-compiled at runtime, and would run really fast (even slightly faster than vectorized code)!
It seems you are looking for that full numeric matrix Y as the output. So, you can try this approach -
m = numel(y);
Y1(m,10) = 0; % faster way to pre-allocate zeros than calling `zeros`
% Source - http://undocumentedmatlab.com/blog/preallocation-performance
linear_idx = (y-1)*m+(1:m)'; % y is stated to be a column vector,
% so y can be used directly instead of y(:)
Y1(linear_idx)=1; % Y1 is the desired output
Benchmarking
Using Amro's benchmark post and increasing the datasize a bit -
y = randi([1 10], [1.5e6 1], 'double');
And finally doing the faster pre-allocation scheme mentioned earlier of using Y(m,10)=0; instead of Y = zeros(m,10);, I got these results on my system -
>> testIndicatorMatrix
ans =
0.1798
0.4651
0.1693
0.1457
That is, the vectorized approach mentioned here (the last one in the benchmark suite) gives more than a 15% performance improvement over your for-loop code (the first one in the benchmark suite). So, if you are working with large data sizes and need the full versions of sparse matrices, this approach would make sense (in my personal opinion).
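Another loop-free formulation, not part of the benchmark above and therefore of unknown relative speed, compares y against the row vector 1:10 with bsxfun. This is only a sketch; the small y below is made-up sample data.

```matlab
y = [3; 4; 1; 10; 9; 9; 4; 2];        % sample column vector of class labels
% Row i is true only in column y(i); double() converts logical to numeric
Y = double(bsxfun(@eq, y, 1:10));

% Reference result from the original for loop, for comparison
Yloop = zeros(numel(y), 10);
for i = 1:numel(y)
    Yloop(i, y(i)) = 1;
end
assert(isequal(Y, Yloop));
```

Whether this beats the loop for very large y would have to be measured with the same timeit harness.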
Does something like this not work for you?
tic;
N = 1e6;
y = randperm( N );
Y = spalloc( N, N, N );
inds = sub2ind( size(Y), y(:), (1:N)' );
Y = sparse( 1:N, y, 1, N, N, N );
toc
The above outputs
Elapsed time is 0.144683 seconds.

Assigning indexes in MatLab using sub2ind

I have 3 data sets of length n: two with coordinates and one with data. With a loop, I would assign the data this way:
MAT = zeros(m, n);
for i = 1:n
MAT(Z(i), X(i)) = MAT(Z(i), X(i)) + DATA(i);
end
I want to do it without a loop. What I am trying to do is something like:
MAT = zeros(m, n);
mn = size(MAT);
MAT(sub2ind(mn, Z, X)) = MAT(sub2ind(mn, Z, X)) + DATA;
Does anyone have an idea how to do this properly and efficiently?
Cheers.
You should use the function accumarray, for example:
Let:
>> Z = [ 1 2 4 3 1];
>> X = [3 2 1 4 3];
>> D = [5 6 7 8 -10];
>> m = 4;n = 4;
Then we have:
>> MAT = accumarray([Z(:),X(:)],D(:),[m,n])
MAT =
0 0 -5 0
0 6 0 0
0 0 0 8
7 0 0 0
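The sub2ind attempt in the question does not match the loop because indexed assignment does not accumulate when an index repeats: each target element ends up with a single written value. A small check, reusing the values above (the names MATloop, MATidx and MATacc are only for this illustration):

```matlab
Z = [1 2 4 3 1];  X = [3 2 1 4 3];  D = [5 6 7 8 -10];
m = 4;  n = 4;

% Reference: explicit loop, which accumulates repeated (Z,X) pairs
MATloop = zeros(m, n);
for i = 1:numel(D)
    MATloop(Z(i), X(i)) = MATloop(Z(i), X(i)) + D(i);
end

% Indexed assignment: the pair (1,3) occurs twice, but only one value is kept
MATidx = zeros(m, n);
idx = sub2ind([m n], Z, X);
MATidx(idx) = MATidx(idx) + D;

% accumarray sums the values that share a subscript
MATacc = accumarray([Z(:), X(:)], D(:), [m n]);

assert(isequal(MATacc, MATloop));      % same as the loop
assert(~isequal(MATidx, MATloop));     % duplicates were not summed
```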

matrix dimensions matlab

I have my function below. The idea is that X is a 3x3 extract from T to be used in the loop. It correctly extracts the 3 rows, but for some reason produces far too many columns; see the example below.
function T = tempsim(rows, cols, topNsideTemp, bottomTemp, tol)
T = zeros(rows,cols);
T(1,:) = topNsideTemp;
T(:,1) = topNsideTemp;
T(:,rows) = topNsideTemp;
T(rows,:) = bottomTemp;
S = [0 1 0; 1 1 1; 0 1 0];
X = zeros(3,3);
A = zeros(3,3);
for ii = 2:(cols-1);
jj = 2:(rows-1);
X = T([(ii-1) ii (ii+1)], [(jj-1) jj (jj+1)])
A = X.*S;
T = (sum(sum(A)))/5
end
test sample
EDU>> T = tempsim(5,4,100,50,0)
X =
100 100 100 100 100 100 100 100 100
100 0 0 0 0 0 0 0 100
100 0 0 0 0 0 0 0 100
ans =
100 100 100 100 100 100 100 100 100
100 0 0 0 0 0 0 0 100
100 0 0 0 0 0 0 0 100
??? Error using ==> times
Matrix dimensions must agree.
Error in ==> tempsim at 14
A = X.*S;
Any thoughts on how to fix this?
There's no need to preallocate X and A if you do a complete assignment anyway. Then, you replace T with a scalar inside the loop, which makes you run into problems in the next iteration. What I'm guessing you want could look something like this:
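function T = tempsim(rows, cols, topNsideTemp, bottomTemp, tol)
T = zeros(rows,cols);
T(1,:) = topNsideTemp;
T(:,1) = topNsideTemp;
T(:,cols) = topNsideTemp; % last column; indexing with rows here would enlarge T when rows > cols
T(rows,:) = bottomTemp;
S = [0 1 0; 1 1 1; 0 1 0];
for ii = 2:(rows-1) % interior rows
for jj = 2:(cols-1) % interior columns
X = T(ii-1:ii+1, jj-1:jj+1); % 3x3 neighbourhood centred on (ii,jj)
A = X.*S;
T(ii,jj) = (sum(sum(A)))/5; % write the average back to the centre cell
end
end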
Although I'm not sure if you really mean to do that – you're working on T while modifying it. As a wild guess, I suspect you might be looking for something like
conv2(T, S/5, 'same')
instead, perhaps after making your fixed-temp borders twice as thick and re-setting them after the call (since conv2 does zero-padding at the outer borders).
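A rough sketch of that conv2 idea follows. Instead of thickening the borders, it assumes that only the interior cells should be updated and uses the 'valid' option, so the fixed-temperature border is never overwritten. The parameter values are taken from the test call in the question; Tnew is a name introduced here.

```matlab
rows = 5;  cols = 4;  topNsideTemp = 100;  bottomTemp = 50;

% Boundary conditions (last column indexed with cols)
T = zeros(rows, cols);
T(1,:)    = topNsideTemp;
T(:,1)    = topNsideTemp;
T(:,cols) = topNsideTemp;
T(rows,:) = bottomTemp;

S = [0 1 0; 1 1 1; 0 1 0];             % five-point stencil

% One sweep: every interior cell becomes the mean of itself and its
% four neighbours, all computed from the old T (not updated in place)
Tnew = T;
Tnew(2:rows-1, 2:cols-1) = conv2(T, S/5, 'valid');
```

Because every new value is computed from the old T, the result is not identical to the nested loop, which reads cells it has already overwritten. Repeating the sweep until the change drops below tol would presumably be the role of the unused tol argument, but that is a guess.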
Here:
jj = 2:(rows-1);
X = T([(ii-1) ii (ii+1)], [(jj-1) jj (jj+1)])
jj becomes [2 3 4]
so X is
T([1 2 3], [ [2 3 4]-1 [2 3 4] [2 3 4]+1 ])
You probably missed a for loop.
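The column count can be checked in isolation. With rows = 5, as in the test call, the concatenation yields nine column indices rather than three (colIdx is a name introduced for this illustration):

```matlab
rows = 5;
jj = 2:(rows-1);                  % a vector, [2 3 4], not a scalar
colIdx = [(jj-1) jj (jj+1)];      % three vectors joined end to end
disp(colIdx)                      % 1 2 3 2 3 4 3 4 5
disp(numel(colIdx))               % 9 columns, matching the width of X above
```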

MATLAB matrix not formatting correctly

I have some code below, and I can't seem to get the matrices formatted correctly. I have been trying to make the matrices look more professional (columns close together) with \t and fprintf, but can't seem to do so. I am also having trouble putting titles on each column of the matrix. Any help would be much appreciated!
clear all
clc
format('bank')
% input file values %
A = [4 6 5 1 0 0 0 0 0; 7 8 4 0 1 0 0 0 0; 6 5 9 0 0 1 0 0 0; 1 0 0 0 0 0 -1 0 0; 0 1 0 0 0 0 0 -1 0; 0 0 1 0 0 0 0 0 -1];
b = [480; 600; 480; 24; 20; 25];
c = [3000 4000 4000 0 0 0 0 0 0];
% Starting xb %
xb = [1 2 3 4 5 6]
% Starting xn %
xn = [7 8 9]
cb = c(xb)
cn = c(xn)
% Get B from A %
B = A(:,xb)
% Get N from A %
N = A(:,xn)
% Calculate z %
z = ((cb*(inv(B))*A)-c)
% Calculate B^(-1) %
Binv = inv(B)
% Calculate RHS of row 0 %
RHS0 = cb*Binv*b
% Calculates A %
A = Binv*A
%STARTING Tableau%
ST = [z RHS0;A b]
for j=1:A
fprintf(1,'\tz%d',j)
end
q = 0
while q == 0
m = input('what is the index value of the ENTERING variable? ')
n = input('what is the index value of the LEAVING variable? ')
xn(xn==m)= n
xb(xb==n) = m
cb = c(xb)
cn = c(xn)
B = A(:,xb)
N = A(:,xn)
Tableuz = (c-(cb*(B^(-1))*A))
RHS0 = (cb*(B^(-1))*b)
TableuA = ((B^(-1))*A)
Tableub = ((B^(-1))*b)
CT = [Tableuz RHS0; TableuA Tableub];
disp(CT)
q = input('Is the tableau optimal? Y-1, N-0')
end
I didn't dig into what you are doing really deeply, but a few pointers:
* Put semicolons at the end of lines you don't want printing to the screen--it makes it easier to see what is happening elsewhere.
* Your for j=1:A loop only prints j. I think what you want is more like this:
for row = 1:size(A,1)
for column = 1:size(A,2)
fprintf('%10.2f', A(row,column));
end
fprintf('\n');
end
If you haven't used the Matlab debugger yet, give it a try; it makes a lot of these problems easier to spot. All you have to do to start it is to add a breakpoint to the file by clicking on the dash(-) next to the line numbers and starting the script. Quick web searches can turn up the solution very quickly too--someone else has usually already had any problem you're going to run into.
Good luck.
Try using num2str with a format argument of your desired precision. It's meant for converting matrices to strings. (note: this is different than mat2str which serializes matrices so they can be deserialized with eval)
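For the column titles asked about in the question, one option is to build a header line with the same field width as the numbers. The following is a minimal sketch with a small made-up matrix; the z1, z2, ... labels and the width of 10 are arbitrary choices.

```matlab
M = [4 6 5; 7 8 4];                              % made-up example matrix
nCols = size(M, 2);

% One right-aligned label per column: z1, z2, ...
names  = arrayfun(@(k) sprintf('z%d', k), 1:nCols, 'UniformOutput', false);
header = sprintf('%10s', names{:});              % format is reused for each label

% sprintf/fprintf consume data column by column, so transpose to print rows
rowFmt = [repmat('%10.2f', 1, nCols) '\n'];
body   = sprintf(rowFmt, M.');

fprintf('%s\n%s', header, body);
```

Keeping the header and the numbers on the same field width is what lines the titles up with their columns; tabs tend to drift once the numbers differ in length.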

How can I generate the following matrix in MATLAB?

I want to generate a matrix that is "stairsteppy" from a vector.
Example input vector: [8 12 17]
Example output matrix:
[1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1]
Is there an easier (or built-in) way to do this than the following?:
function M = stairstep(v)
M = zeros(length(v),max(v));
v2 = [0 v];
for i = 1:length(v)
M(i,(v2(i)+1):v2(i+1)) = 1;
end
You can do this via indexing.
A = eye(3);
B = A(:,[zeros(1,8)+1, zeros(1,4)+2, zeros(1,5)+3])
Here's a solution without explicit loops:
function M = stairstep(v)
L = length(v); % M will be
V = max(v); % an L x V matrix
M = zeros(L, V);
% create indices to set to one
idx = zeros(1, V);
idx(v + 1) = 1;
idx = cumsum(idx) + 1;
idx = sub2ind(size(M), idx(1:V), 1:V);
% update the output matrix
M(idx) = 1;
EDIT: fixed bug :p
There's no built-in function I know of to do this, but here's one vectorized solution:
v = [8 12 17];
N = numel(v);
M = zeros(N,max(v));
M([0 v(1:N-1)]*N+(1:N)) = 1;
M(v(1:N-1)*N+(1:N-1)) = -1;
M = cumsum(M,2);
EDIT: I like the idea that Jonas had to use BLKDIAG. I couldn't help playing with the idea a bit until I shortened it further (using MAT2CELL instead of ARRAYFUN):
C = mat2cell(ones(1,max(v)),1,diff([0 v]));
M = blkdiag(C{:});
A very short version of a vectorized solution
function out = stairstep(v)
% create lists of ones
oneCell = arrayfun(@(x)ones(1,x),diff([0,v]),'UniformOutput',false);
% create output
out = blkdiag(oneCell{:});
You can use ones to define the places where you have 1's:
http://www.mathworks.com/help/techdoc/ref/ones.html
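One more loop-free variant, sketched here as an alternative to the answers above: compare each column index against the lower and upper boundary of every step. The names lo, hi and cols are introduced only for this example, and v is assumed to be increasing, as in the question.

```matlab
v = [8 12 17];
cols = 1:max(v);                  % column indices 1..17
hi = v(:);                        % last column of each step
lo = [0; hi(1:end-1)];            % column just before each step starts
% Row r is 1 where lo(r) < column <= hi(r)
M = double(bsxfun(@gt, cols, lo) & bsxfun(@le, cols, hi));
```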