Vectorize octave/matlab codes - matlab

Following is the octave codes(part of kmeans)
centroidSum = zeros(K);
valueSum = zeros(K, n);
for i = 1 : m
for j = 1 : K
if(idx(i) == j)
centroidSum(j) = centroidSum(j) + 1;
valueSum(j, :) = valueSum(j, :) + X(i, :);
end
end
end
The codes work, is it possible to vectorize the codes?
It is easy to vectorize the codes without if statement,
but how could we vectorize the codes with if statement?

I assume the purpose of the code is to compute the centroids of subsets of a set of m data points in an n-dimensional space, where the points are stored in a matrix X (points x coordinates) and the vector idx specifies for each data point the subset (1 ... K) the point belongs to. Then a partial vectorization is:
centroid = zeros(K, n)
for j = 1 : K
centroid(j, :) = mean(X(idx == j, :));
end
The if is eliminated by indexing, in particular logical indexing: idx == j gives a boolean array which indicates which data points belong to subset j.
I think it might be possible to get rid of the second for-loop, too, but this would result in very convoluted, unintelligible code.

Brief introduction and solution code
This could be one fully vectorized approach based on -
accumarray: For accumulating summations as done for calulating valueSum. This also introduces a technique how one can use accumarray on a 2D matrix along a certain direction, which isn't possible in a straight-forward manner with it.
bsxfun: For calculating linear indices across all columns for matching row indices from idx.
Here's the implementation -
%// Store no. of columns in X for frequent usage later on
ncols = size(X,2);
%// Find indices in idx that are within [1:k] range, call them as labels
%// Also, find their locations in that range array, call those as pos
[pos,id] = ismember(idx,1:K);
labels = id(pos);
%// OR with bsxfun: [pos,labels] = find(bsxfun(#eq,idx(:),1:K));
%// Find all labels, i.e. across all columns of X
all_labels = bsxfun(#plus,labels(:),[0:ncols-1]*K);
%// Get truncated X corresponding to all indices matches across all columns
X_cut = X(pos,:);
%// Accumulate summations within each column based on the labels.
%// Note that accumarray doesn't accept matrices, so we were required
%// to create all_labels that had same labels within each column and
%// offsetted at constant intervals from consecutive columns
acc1 = accumarray(all_labels(:),X_cut(:));
%// Regularise accumulated array and reshape back to a 2D array version
acc1_reg2D = [acc1 ; zeros(K*ncols - numel(acc1),1)];
valueSum = reshape(acc1_reg2D,[],ncols);
centroidSum = histc(labels,1:K); %// Get labels counts as centroid sums
Benchmarking code
%// Datasize parameters
K = 5000;
n = 5000;
m = 5000;
idx = randi(9,1,m);
X = rand(m,n);
disp('----------------------------- With Original Approach')
tic
centroidSum1 = zeros(K,1);
valueSum1 = zeros(K, n);
for i = 1 : m
for j = 1 : K
if(idx(i) == j)
centroidSum1(j) = centroidSum1(j) + 1;
valueSum1(j, :) = valueSum1(j, :) + X(i, :);
end
end
end
toc, clear valueSum1 centroidSum1
disp('----------------------------- With Proposed Approach')
tic
%// ... Code from earlied mentioned section
toc
Runtime results
----------------------------- With Original Approach
Elapsed time is 1.235412 seconds.
----------------------------- With Proposed Approach
Elapsed time is 0.379133 seconds.

Not sure about its runtime performance but here's a non-convoluted vectorized implementation:
b = idx == 1:K;
centroids = (b' * X) ./ sum(b)';

Vectorizing the calculation makes a huge difference in performance. Benchmarking
The original code,
The partial vectorization from A. Donda and
The full vectorization from Tom,
gave me the following results:
Original Code: Elapsed time is 1.327877 seconds.
Partial Vectorization: Elapsed time is 0.630767 seconds.
Full Vectorization: Elapsed time is 0.021129 seconds.
Benchmarking code here:
%// Datasize parameters
K = 5000;
n = 5000;
m = 5000;
idx = randi(9,1,m);
X = rand(m,n);
fprintf('\nOriginal Code: ')
tic
centroidSum1 = zeros(K,1);
valueSum1 = zeros(K, n);
for i = 1 : m
for j = 1 : K
if(idx(i) == j)
centroidSum1(j) = centroidSum1(j) + 1;
valueSum1(j, :) = valueSum1(j, :) + X(i, :);
end
end
end
centroids = valueSum1 ./ centroidSum1;
toc, clear valueSum1 centroidSum1 centroids
fprintf('\nPartial Vectorization: ')
tic
centroids = zeros(K,n);
for k = 1:K
centroids(k,:) = mean( X(idx == k, :) );
end
toc, clear centroids
fprintf('\nFull Vectorization: ')
tic
centroids = zeros(K,n);
b = idx == 1:K;
centroids = (b * X) ./ sum(b)';
toc
Note, I added an extra line to the original code to element-wise divide valueSum1 by centroidSum1 to make the output of each type of code the same.
Finally, I know this isn't strictly an "answer", however I don't have enough reputation to add a comment, and I thought the benchmarking figures were useful to anyone who is learning MATLAB (like myself) and needs some extra motivation to master vectorization.

Related

Vectorize with Matlab Meshgrid in Chebfun

I am trying to use meshgrid in Matlab together with Chebfun to get rid of double for loops. I first define a quasi-matrix of N functions,
%Define functions of type Chebfun
N = 10; %number of functions
x = chebfun('x', [0 8]); %Domain
psi = [];
for i = 1:N
psi = [psi sin(i.*pi.*x./8)];
end
A sample calculation would be to compute the double sum $\sum_{i,j=1}^10 psi(:,i).*psi(:,j)$. I can achieve this using two for loops in Matlab,
h = 0;
for i = 1:N
for j = 1:N
h = h + psi(:,i).*psi(:,j);
end
end
I then tried to use meshgrid to vectorize in the following way:
[i j] = meshgrid(1:N,1:N);
h = psi(:,i).*psi(:,j);
I get the error "Column index must be a vector of integers". How can I overcome this issue so that I can get rid of my double for loops and make my code a bit more efficient?
BTW, Chebfun is not part of native MATLAB and you have to download it in order to run your code: http://www.chebfun.org/. However, that shouldn't affect how I answer your question.
Basically, psi is a N column matrix and it is your desire to add up products of all combinations of pairs of columns in psi. You have the right idea with meshgrid, but what you should do instead is unroll the 2D matrix of coordinates for both i and j so that they're single vectors. You'd then use this and create two N^2 column matrices that is in such a way where each column corresponds to that exact column numbers specified from i and j sampled from psi. You'd then do an element-wise multiplication between these two matrices and sum across all of the columns for each row. BTW, I'm going to use ii and jj as variables from the output of meshgrid instead of i and j. Those variables are reserved for the complex number in MATLAB and I don't want to overshadow those unintentionally.
Something like this:
%// Your code
N = 10; %number of functions
x = chebfun('x', [0 8]); %Domain
psi = [];
for i = 1:N
psi = [psi sin(i.*pi.*x./8)];
end
%// New code
[ii,jj] = meshgrid(1:N, 1:N);
%// Create two matrices and sum
matrixA = psi(:, ii(:));
matrixB = psi(:, jj(:));
h = sum(matrixA.*matrixB, 2);
If you want to do away with the temporary variables, you can do it in one statement after calling meshgrid:
h = sum(psi(:, ii(:)).*psi(:, jj(:)), 2);
I don't have Chebfun installed, but we can verify that this calculates what we need with a simple example:
rng(123);
N = 10;
psi = randi(20, N, N);
Running this code with the above more efficient solution gives us:
>> h
h =
8100
17161
10816
12100
14641
9216
10000
8649
9025
11664
Also, running the above double for loop code also gives us:
>> h
h =
8100
17161
10816
12100
14641
9216
10000
8649
9025
11664
If you want to be absolutely sure, we can have both codes run with the outputs as separate variables, then check if they're equal:
%// Setup
rng(123);
N = 10;
psi = randi(20, N, N);
%// Old code
h = 0;
for i = 1:N
for j = 1:N
h = h + psi(:,i).*psi(:,j);
end
end
%// New code
[ii,jj] = meshgrid(1:N, 1:N);
hnew = sum(psi(:, ii(:)).*psi(:, jj(:)), 2);
%// Check for equality
eql = isequal(h, hnew);
eql checks if both variables are equal, and we do get them as such:
>> eql
eql =
1

Vectorize loop to increase efficiency

I have a 3 for loops and I would like if possible to vectorize the two inner loops.
for t=1:size(datesdaily1)
for i=1:size(secids,1)
sum=0;
if inc(t,i)==1
for j=1:size(secids,1)
if inc(t,j)==1
sum=sum+weig1(t,j)*sqrt(Rates(t,j))*rhoneutral(i,j);
end
end
b(t,i)=sqrt(Rates(t,i))*sum/MRates(t,1);
end
end
end
Any idea on how to accomplish that? Here 'weig', 'inc' and 'Rates' are (size(datesdaily1) by size(secids,1)) matrixes and 'rhoneutral' is a (size(secids,1) by size(secids,1)) matrix.
I tried but I was not able to figure out how to do it ...
Actual full code:
for t=1:size(datesdaily1)
rho=NaN(size(secids,1),size(secids,1));
aux=datesdaily1(t,1);
windowlenght=252;
index=find(datesdaily==aux);
auxret=dailyret(index-windowlenght+1:index,:);
numerator=0;
denominator=0;
auxret(:,any(isnan(auxret))) = NaN;
rho = corr(auxret, 'rows','pairwise');
rho1 = 1 - rho;
w = weig1(t,:) .* sqrt(Rates(t,:));
x = w.' * w;
y = x .* rho;
z = x .* rho1;
numerator = numerator + nansum(nansum(y));
denominator = denominator + nansum(nansum(z));;
if not(denominator==0)
alpha(t,1)=-(MRates(t,1)-numerator)/denominator;
%Stocks included
inc(t,:)=not(isnan(weig1(t,:).*diag(rho)'.*Rates(t,:)));
rhoneutral=rho-alpha(t,1).*(1-rho);
for i=1:size(secids,1)
sum=0;
if inc(t,i)==1
for j=1:size(secids,1)
if inc(t,j)==1
sum=sum+weig1(t,j)*sqrt(Rates(t,j))*rhoneutral(i,j);
end
end
bet(t,i)=sqrt(Rates(t,i))*sum/MRates(t,1);
end
end
check(t,1)=nansum(weig1(t,:).*bet(t,:));
end
end
One vectorized approach using fast matrix multiplication in MATLAB -
%// Mask of valid calculations
mask = inc==1
%// Store square root of Rates which seem to be used rather than Rates itself
sqRates = sqrt(Rates)
%// Use mask to set invalid positions in weig1 and sqRates to zeros
weig1masked = weig1.*mask
sqRates = sqRates.*mask
%// Perform the sum calculations using matrix multiplication.
%// This is where the magic happens!!
sum_vals = (weig1masked.*sqRates)*rhoneutral' %//'
%// Perform the outermost loop calculations for the final output
b_vect = bsxfun(#rdivide,sum_vals.*sqRates,MRates)
Benchmarking
Here's a benchmark test specially dedicated to #Dmitry Grigoryev for the doubts put on vectorization for performance -
M = 200;
N = 200;
weig1 = rand(M,N);
inc = rand(M,N)>0.5;
Rates = rand(M,N);
rhoneutral = rand(N,N);
MRates = rand(M,1);
disp('--------------------------- With Original Approach')
tic
%// Code from the original approach
toc
disp('--------------------------- With DmitryGrigoryev Approach')
tic
%// Code from the DmitryGrigoryev's solution
toc
disp('--------------------------- With Much-Hated Vectorized Approach')
tic
%// Proposed matrix-multiplication approach in this solution
toc
Runtimes -
--------------------------- With Original Approach
Elapsed time is 0.104084 seconds.
--------------------------- With DmitryGrigoryev Approach
Elapsed time is 3.562170 seconds.
--------------------------- With Much-Hated Vectorized Approach
Elapsed time is 0.002058 seconds.
Posting runtimes for bigger datasizes might just be too embarrasing for loopy approches, way to go vectorization!!

Putting a matrix in to sparse form in linear time using matlab

I'm trying to put a matrix in sparse form, and so far I'm looping through the rows and columns checking every element to see if it's non zero. However, this seems to be order n^2, where n is the number of non zero elements in the matrix. Is there a way to do this order n?
Here is my code
function a = inputMatrix(X)
[m,n] = size(X);
k=1;
for i=1:m
for j=1:n
if X(i,j)~=0
X(i,j);
a(1,k) = struct('row',i,'column',j,'data',X(i,j));
k = k+1;
end
end
end
Here is my time testing
ri = zeros(250,250);
rj = zeros(250,250);
rk = zeros(100,100);
for i=1:10000
ri(i) = randi([-10000,10000],1,1);
end
tic;
inputMatrix(ri);
time(1) = toc;
for i=1:5000
rj(i) = randi([-10000,10000],1,1);
end
tic;
inputMatrix(rj);
time(2) = toc;
for i=1:1000
rk(i) = randi([-10000,10000],1,1);
end
tic;
inputMatrix(rj);
time(3) = toc;
Results:
time = 1.8009 0.5619 0.5545
no. non zero entires = 10 000, 5000, 1000
The results suggest there is a non linear relationship, and not order N
find gives you the indices of nonzero elements and sparse converts the matrix directly to a sparse matrix.

How to Build a Distance Matrix without a Loop (Vectorization)?

I have many points and I want to build distance matrix i.e. distance of every point with all of other points but I want to don't use from loop because take too time...
Is a better way for building this matrix?
this is my loop: for a setl with size: 10000x3 this method take a lot of my time :(
for i=1:size(setl,1)
for j=1:size(setl,1)
dist = sqrt((xl(i)-xl(j))^2+(yl(i)-yl(j))^2+...
(zl(i)-zl(j))^2);
distanceMatrix(i,j) = dist;
end
end
How about using some linear algebra? The distance of two points can be computed from the inner product of their position vectors,
D(x, y) = ∥y – x∥ = √ (
xT x + yT y – 2 xT y ),
and the inner product for all pairs of points can be obtained through a simple matrix operation.
x = [xl(:)'; yl(:)'; zl(:)'];
IP = x' * x;
d = sqrt(bsxfun(#plus, diag(IP), diag(IP)') - 2 * IP);
For 10000 points, I get the following timing results:
ahmad's loop + shoelzer's preallocation: 7.8 seconds
Dan's vectorized indices: 5.3 seconds
Mohsen's bsxfun: 1.5 seconds
my solution: 1.3 seconds
You can use bsxfun which is generally a faster solution:
s = [xl(:) yl(:) zl(:)];
d = sqrt(sum(bsxfun(#minus, permute(s, [1 3 2]), permute(s, [3 1 2])).^2,3));
You can do this fully vectorized like so:
n = numel(xl);
[X, Y] = meshgrid(1:n,1:n);
Ix = X(:)
Iy = Y(:)
reshape(sqrt((xl(Ix)-xl(Iy)).^2+(yl(Ix)-yl(Iy)).^2+(zl(Ix)-zl(Iy)).^2), n, n);
If you look at Ix and Iy (try it for like a 3x3 dataset), they make every combination of linear indexes possible for each of your matrices. Now you can just do each subtraction in one shot!
However mixing the suggestions of shoelzer and Jost will give you an almost identical performance performance boost:
n = 50;
xl = rand(n,1);
yl = rand(n,1);
zl = rand(n,1);
tic
for t = 1:100
distanceMatrix = zeros(n); %// Preallocation
for i=1:n
for j=min(i+1,n):n %// Taking advantge of symmetry
distanceMatrix(i,j) = sqrt((xl(i)-xl(j))^2+(yl(i)-yl(j))^2+(zl(i)-zl(j))^2);
end
end
d1 = distanceMatrix + distanceMatrix'; %'
end
toc
%// Vectorized solution that creates linear indices using meshgrid
tic
for t = 1:100
[X, Y] = meshgrid(1:n,1:n);
Ix = X(:);
Iy = Y(:);
d2 = reshape(sqrt((xl(Ix)-xl(Iy)).^2+(yl(Ix)-yl(Iy)).^2+(zl(Ix)-zl(Iy)).^2), n, n);
end
toc
Returns:
Elapsed time is 0.023332 seconds.
Elapsed time is 0.024454 seconds.
But if I change n to 500 then I get
Elapsed time is 1.227956 seconds.
Elapsed time is 2.030925 seconds.
Which just goes to show that you should always bench mark solutions in Matlab before writing off loops as slow! In this case, depending on the scale of your solution, loops could be significantly faster.
Be sure to preallocate distanceMatrix. Your loops will run much, much faster and vectorization probably isn't needed. Even if you do it, there may not be any further speed increase.
The latest versions (Since R2016b) of MATLAB support Implicit Broadcasting (See also noted on bsxfun()).
Hence the fastest way for distance matrix is:
function [ mDistMat ] = CalcDistanceMatrix( mA, mB )
mDistMat = sum(mA .^ 2).' - (2 * mA.' * mB) + sum(mB .^ 2);
end
Where the points are along the columns of the set.
In your case mA = mB.
Have a look on my Calculate Distance Matrix Project.

Vectorizing MATLAB function

I have double summation over m = 1:M and n = 1:N for polar point with coordinates rho, phi, z:
I have written vectorized notation of it:
N = 10;
M = 10;
n = 1:N;
m = 1:M;
rho = 1;
phi = 1;
z = 1;
summ = cos (n*z) * besselj(m'-1, n*rho) * cos(m*phi)';
Now I need to rewrite this function for accepting vectors (columns) of coordinates rho, phi, z. I tried arrayfun, cellfun, simple for loop - they work too slow for me. I know about "MATLAB array manipulation tips and tricks", but as MATLAB beginner I can't understand repmat and other functions.
Can anybody suggest vectorized solution?
I think your code is already well vectorized (for n and m). If you want this function to also accept an array of rho/phi/z values, I suggest you simply process the values in a for-loop, as I doubt any further vectorization will bring significant improvements (plus the code will be harder to read).
Having said that, in the code below, I tried to vectorize the part where you compute the various components being multiplied {row N} * { matrix N*M } * {col M} = {scalar}, by making a single call to the BESSELJ and COS functions (I place each of the row/matrix/column in the third dimension). Their multiplication is still done in a loop (ARRAYFUN to be exact):
%# parameters
N = 10; M = 10;
n = 1:N; m = 1:M;
num = 50;
rho = 1:num; phi = 1:num; z = 1:num;
%# straightforward FOR-loop
tic
result1 = zeros(1,num);
for i=1:num
result1(i) = cos(n*z(i)) * besselj(m'-1, n*rho(i)) * cos(m*phi(i))';
end
toc
%# vectorized computation of the components
tic
a = cos( bsxfun(#times, n, permute(z(:),[3 2 1])) );
b = besselj(m'-1, reshape(bsxfun(#times,n,rho(:))',[],1)'); %'
b = permute(reshape(b',[length(m) length(n) length(rho)]), [2 1 3]); %'
c = cos( bsxfun(#times, m, permute(phi(:),[3 2 1])) );
result2 = arrayfun(#(i) a(:,:,i)*b(:,:,i)*c(:,:,i)', 1:num); %'
toc
%# make sure the two results are the same
assert( isequal(result1,result2) )
I did another benchmark test using the TIMEIT function (gives more fair timings). The result agrees with the previous:
0.0062407 # elapsed time (seconds) for the my solution
0.015677 # elapsed time (seconds) for the FOR-loop solution
Note that as you increase the size of the input vectors, the two methods will start to have similar timings (the FOR-loop even wins on some occasions)
You need to create two matrices, say m_ and n_ so that by selecting element i,j of each matrix you get the desired index for both m and n.
Most MATLAB functions accept matrices and vectors and compute their results element by element. So to produce a double sum, you compute all elements of the sum in parallel by f(m_, n_) and sum them.
In your case (note that the .* operator performs element-wise multiplication of matrices)
N = 10;
M = 10;
n = 1:N;
m = 1:M;
rho = 1;
phi = 1;
z = 1;
% N rows x M columns for each matrix
% n_ - all columns are identical
% m_ - all rows are identical
n_ = repmat(n', 1, M);
m_ = repmat(m , N, 1);
element_nm = cos (n_*z) .* besselj(m_-1, n_*rho) .* cos(m_*phi);
sum_all = sum( element_nm(:) );