I have double summation over m = 1:M and n = 1:N for polar point with coordinates rho, phi, z:
I have written vectorized notation of it:
N = 10;
M = 10;
n = 1:N;
m = 1:M;
rho = 1;
phi = 1;
z = 1;
summ = cos (n*z) * besselj(m'-1, n*rho) * cos(m*phi)';
Now I need to rewrite this function for accepting vectors (columns) of coordinates rho, phi, z. I tried arrayfun, cellfun, simple for loop - they work too slow for me. I know about "MATLAB array manipulation tips and tricks", but as MATLAB beginner I can't understand repmat and other functions.
Can anybody suggest vectorized solution?
I think your code is already well vectorized (for n and m). If you want this function to also accept an array of rho/phi/z values, I suggest you simply process the values in a for-loop, as I doubt any further vectorization will bring significant improvements (plus the code will be harder to read).
Having said that, in the code below, I tried to vectorize the part where you compute the various components being multiplied {row N} * { matrix N*M } * {col M} = {scalar}, by making a single call to the BESSELJ and COS functions (I place each of the row/matrix/column in the third dimension). Their multiplication is still done in a loop (ARRAYFUN to be exact):
%# parameters
N = 10; M = 10;
n = 1:N; m = 1:M;
num = 50;
rho = 1:num; phi = 1:num; z = 1:num;
%# straightforward FOR-loop
tic
result1 = zeros(1,num);
for i=1:num
result1(i) = cos(n*z(i)) * besselj(m'-1, n*rho(i)) * cos(m*phi(i))';
end
toc
%# vectorized computation of the components
tic
a = cos( bsxfun(#times, n, permute(z(:),[3 2 1])) );
b = besselj(m'-1, reshape(bsxfun(#times,n,rho(:))',[],1)'); %'
b = permute(reshape(b',[length(m) length(n) length(rho)]), [2 1 3]); %'
c = cos( bsxfun(#times, m, permute(phi(:),[3 2 1])) );
result2 = arrayfun(#(i) a(:,:,i)*b(:,:,i)*c(:,:,i)', 1:num); %'
toc
%# make sure the two results are the same
assert( isequal(result1,result2) )
I did another benchmark test using the TIMEIT function (gives more fair timings). The result agrees with the previous:
0.0062407 # elapsed time (seconds) for the my solution
0.015677 # elapsed time (seconds) for the FOR-loop solution
Note that as you increase the size of the input vectors, the two methods will start to have similar timings (the FOR-loop even wins on some occasions)
You need to create two matrices, say m_ and n_ so that by selecting element i,j of each matrix you get the desired index for both m and n.
Most MATLAB functions accept matrices and vectors and compute their results element by element. So to produce a double sum, you compute all elements of the sum in parallel by f(m_, n_) and sum them.
In your case (note that the .* operator performs element-wise multiplication of matrices)
N = 10;
M = 10;
n = 1:N;
m = 1:M;
rho = 1;
phi = 1;
z = 1;
% N rows x M columns for each matrix
% n_ - all columns are identical
% m_ - all rows are identical
n_ = repmat(n', 1, M);
m_ = repmat(m , N, 1);
element_nm = cos (n_*z) .* besselj(m_-1, n_*rho) .* cos(m_*phi);
sum_all = sum( element_nm(:) );
Related
In the question, A is a square matrix and x is a vector. The non-zero entries of A can only be 1 or -1. Then how to speed up A*x in Matlab?
I asked this question, because A*x contains multiplication operations that are computation-expensive.
However, if A only contains some +1 and -1, and assume y is a row vector from A, then y*x can avoid multiplication operations. You just need group the entries of x into 2, one is corresponding to the entries +1 of y, and the other one is corresponding to the entries -1 of y. Then do a summation of these 2 groups respectively, and do a subtraction.
One way to speed this up in MATLAB is to take advantage of the sparse structure of A (rather than the +1, -1 structure). You can realize huge speedups this way if A is sparse enough:
N = 1e4;
k = .0001;
% setup
x = rand(N, 1);
inds = find(rand(1,N^2) <= k);
values = 2*(randi(2, 1, length(inds))-1)-1;
% dense
fprintf('dense\n')
A = zeros(N,N);
A(inds) = values;
for i = 1:5
tic;
y = A*x;
toc;
end
% sparse
fprintf('sparse\n')
A = sparse(N,N);
A(inds) = values;
for i = 1:5
tic;
y = A*x;
toc;
end
I am trying to use meshgrid in Matlab together with Chebfun to get rid of double for loops. I first define a quasi-matrix of N functions,
%Define functions of type Chebfun
N = 10; %number of functions
x = chebfun('x', [0 8]); %Domain
psi = [];
for i = 1:N
psi = [psi sin(i.*pi.*x./8)];
end
A sample calculation would be to compute the double sum $\sum_{i,j=1}^10 psi(:,i).*psi(:,j)$. I can achieve this using two for loops in Matlab,
h = 0;
for i = 1:N
for j = 1:N
h = h + psi(:,i).*psi(:,j);
end
end
I then tried to use meshgrid to vectorize in the following way:
[i j] = meshgrid(1:N,1:N);
h = psi(:,i).*psi(:,j);
I get the error "Column index must be a vector of integers". How can I overcome this issue so that I can get rid of my double for loops and make my code a bit more efficient?
BTW, Chebfun is not part of native MATLAB and you have to download it in order to run your code: http://www.chebfun.org/. However, that shouldn't affect how I answer your question.
Basically, psi is a N column matrix and it is your desire to add up products of all combinations of pairs of columns in psi. You have the right idea with meshgrid, but what you should do instead is unroll the 2D matrix of coordinates for both i and j so that they're single vectors. You'd then use this and create two N^2 column matrices that is in such a way where each column corresponds to that exact column numbers specified from i and j sampled from psi. You'd then do an element-wise multiplication between these two matrices and sum across all of the columns for each row. BTW, I'm going to use ii and jj as variables from the output of meshgrid instead of i and j. Those variables are reserved for the complex number in MATLAB and I don't want to overshadow those unintentionally.
Something like this:
%// Your code
N = 10; %number of functions
x = chebfun('x', [0 8]); %Domain
psi = [];
for i = 1:N
psi = [psi sin(i.*pi.*x./8)];
end
%// New code
[ii,jj] = meshgrid(1:N, 1:N);
%// Create two matrices and sum
matrixA = psi(:, ii(:));
matrixB = psi(:, jj(:));
h = sum(matrixA.*matrixB, 2);
If you want to do away with the temporary variables, you can do it in one statement after calling meshgrid:
h = sum(psi(:, ii(:)).*psi(:, jj(:)), 2);
I don't have Chebfun installed, but we can verify that this calculates what we need with a simple example:
rng(123);
N = 10;
psi = randi(20, N, N);
Running this code with the above more efficient solution gives us:
>> h
h =
8100
17161
10816
12100
14641
9216
10000
8649
9025
11664
Also, running the above double for loop code also gives us:
>> h
h =
8100
17161
10816
12100
14641
9216
10000
8649
9025
11664
If you want to be absolutely sure, we can have both codes run with the outputs as separate variables, then check if they're equal:
%// Setup
rng(123);
N = 10;
psi = randi(20, N, N);
%// Old code
h = 0;
for i = 1:N
for j = 1:N
h = h + psi(:,i).*psi(:,j);
end
end
%// New code
[ii,jj] = meshgrid(1:N, 1:N);
hnew = sum(psi(:, ii(:)).*psi(:, jj(:)), 2);
%// Check for equality
eql = isequal(h, hnew);
eql checks if both variables are equal, and we do get them as such:
>> eql
eql =
1
I am aiming to minimize the below cost function over W
J = (E)^2
E = A - W .* B
Such that W(n+1) = W(n) - (u/2) * delJ
delJ = gradient of J = -2 * E .* B
u = step_size=0.2
where:
- A, B are STFT matrix of 2 audio signals (dimension is 257x4000 for a 16s audio with window size = 256 , 75% overlap, nfft=512)
- W is a matrix constructed with [257x1] vector repeated 4000 times (so that it become 257x4000] matrix
I have written my customize function as below. The problem is, the elements in A and B are so small (~e-20) that even after 1000 iteration that no change happens to g.
I am certainly missing something, if anybody can help or guide me to some link which explains the whole process for a new person.
[M,N] = size(A);
E =#(x) A - repmat(x,1,N) .* B; % Error Function
J = #(x) E(x) .^ 2; % Cost Function
G = #(x) -2 * E(x) .* B; % Gradiant Function
alpha = .2; % Step size
maxiter = 500; % Max iteration
dwmin = 1e-6; % Min change in gradiation
tolerence = 1e-6; % Max Gradiant norm
gnorm = inf;
w = rand(M,1);
dw = inf;
for i = 1:maxiter
g = G(w);
gnorm = norm(g);
wnew = w - (alpha/2)*g(:,1);
dw = norm(wnew-w)
if or(dw < dwmin, gnorm < tolerence)
break
end
end
w = wnew;
A & B are always positive real numbered vectors.
Your problem is actually a series of independent problems. If we index each row of A and B and each element of w with i, then minimizing the sum of squares across the error matrix
A - repmat(w, 1, N) .* B
is the same as minimizing the the sum of squares across the error vector
A(i, :) - w(i) * B(i, :)
for all rows separately. The latter problem can be solved using one of the least-squares operators of Matlab, in particular mrdivide or /:
for i = 1 : M
w(i) = A(i, :) / B(i, :);
end
As far as I can tell, there is no way to further vectorize this calculation.
In any case, there is no need to use a gradient descent or other form of optimization algorithm.
Following is the octave codes(part of kmeans)
centroidSum = zeros(K);
valueSum = zeros(K, n);
for i = 1 : m
for j = 1 : K
if(idx(i) == j)
centroidSum(j) = centroidSum(j) + 1;
valueSum(j, :) = valueSum(j, :) + X(i, :);
end
end
end
The codes work, is it possible to vectorize the codes?
It is easy to vectorize the codes without if statement,
but how could we vectorize the codes with if statement?
I assume the purpose of the code is to compute the centroids of subsets of a set of m data points in an n-dimensional space, where the points are stored in a matrix X (points x coordinates) and the vector idx specifies for each data point the subset (1 ... K) the point belongs to. Then a partial vectorization is:
centroid = zeros(K, n)
for j = 1 : K
centroid(j, :) = mean(X(idx == j, :));
end
The if is eliminated by indexing, in particular logical indexing: idx == j gives a boolean array which indicates which data points belong to subset j.
I think it might be possible to get rid of the second for-loop, too, but this would result in very convoluted, unintelligible code.
Brief introduction and solution code
This could be one fully vectorized approach based on -
accumarray: For accumulating summations as done for calulating valueSum. This also introduces a technique how one can use accumarray on a 2D matrix along a certain direction, which isn't possible in a straight-forward manner with it.
bsxfun: For calculating linear indices across all columns for matching row indices from idx.
Here's the implementation -
%// Store no. of columns in X for frequent usage later on
ncols = size(X,2);
%// Find indices in idx that are within [1:k] range, call them as labels
%// Also, find their locations in that range array, call those as pos
[pos,id] = ismember(idx,1:K);
labels = id(pos);
%// OR with bsxfun: [pos,labels] = find(bsxfun(#eq,idx(:),1:K));
%// Find all labels, i.e. across all columns of X
all_labels = bsxfun(#plus,labels(:),[0:ncols-1]*K);
%// Get truncated X corresponding to all indices matches across all columns
X_cut = X(pos,:);
%// Accumulate summations within each column based on the labels.
%// Note that accumarray doesn't accept matrices, so we were required
%// to create all_labels that had same labels within each column and
%// offsetted at constant intervals from consecutive columns
acc1 = accumarray(all_labels(:),X_cut(:));
%// Regularise accumulated array and reshape back to a 2D array version
acc1_reg2D = [acc1 ; zeros(K*ncols - numel(acc1),1)];
valueSum = reshape(acc1_reg2D,[],ncols);
centroidSum = histc(labels,1:K); %// Get labels counts as centroid sums
Benchmarking code
%// Datasize parameters
K = 5000;
n = 5000;
m = 5000;
idx = randi(9,1,m);
X = rand(m,n);
disp('----------------------------- With Original Approach')
tic
centroidSum1 = zeros(K,1);
valueSum1 = zeros(K, n);
for i = 1 : m
for j = 1 : K
if(idx(i) == j)
centroidSum1(j) = centroidSum1(j) + 1;
valueSum1(j, :) = valueSum1(j, :) + X(i, :);
end
end
end
toc, clear valueSum1 centroidSum1
disp('----------------------------- With Proposed Approach')
tic
%// ... Code from earlied mentioned section
toc
Runtime results
----------------------------- With Original Approach
Elapsed time is 1.235412 seconds.
----------------------------- With Proposed Approach
Elapsed time is 0.379133 seconds.
Not sure about its runtime performance but here's a non-convoluted vectorized implementation:
b = idx == 1:K;
centroids = (b' * X) ./ sum(b)';
Vectorizing the calculation makes a huge difference in performance. Benchmarking
The original code,
The partial vectorization from A. Donda and
The full vectorization from Tom,
gave me the following results:
Original Code: Elapsed time is 1.327877 seconds.
Partial Vectorization: Elapsed time is 0.630767 seconds.
Full Vectorization: Elapsed time is 0.021129 seconds.
Benchmarking code here:
%// Datasize parameters
K = 5000;
n = 5000;
m = 5000;
idx = randi(9,1,m);
X = rand(m,n);
fprintf('\nOriginal Code: ')
tic
centroidSum1 = zeros(K,1);
valueSum1 = zeros(K, n);
for i = 1 : m
for j = 1 : K
if(idx(i) == j)
centroidSum1(j) = centroidSum1(j) + 1;
valueSum1(j, :) = valueSum1(j, :) + X(i, :);
end
end
end
centroids = valueSum1 ./ centroidSum1;
toc, clear valueSum1 centroidSum1 centroids
fprintf('\nPartial Vectorization: ')
tic
centroids = zeros(K,n);
for k = 1:K
centroids(k,:) = mean( X(idx == k, :) );
end
toc, clear centroids
fprintf('\nFull Vectorization: ')
tic
centroids = zeros(K,n);
b = idx == 1:K;
centroids = (b * X) ./ sum(b)';
toc
Note, I added an extra line to the original code to element-wise divide valueSum1 by centroidSum1 to make the output of each type of code the same.
Finally, I know this isn't strictly an "answer", however I don't have enough reputation to add a comment, and I thought the benchmarking figures were useful to anyone who is learning MATLAB (like myself) and needs some extra motivation to master vectorization.
I have a randomly generated vector, say A of length M.
Say:
A = rand(M,1)
And I also have a function, X(k) = sin(2*pi*k).
How would I find Y(k) which is summation of A(l)*X(k-l) as l goes from 0 to M?
Assume any value of k... But the answer should be summation of all M+1 terms.
Given M and k, this is how you can perform your summation:
A = rand(M+1,1); %# Create M+1 random values
Y = sin(2*pi*(k-(0:M)))*A; %# Use a matrix multiply to perform the summation
EDIT: You could even create a function for Y that takes k and A as arguments:
Y = #(k,A) sin(2*pi*(k+1-(1:numel(A))))*A; %# An anonymous function
result = Y(k,A); %# Call the function Y
A = rand(1, M + 1);
X = sin(2 * pi * (k - (0 : M)));
total = sum(A .* X);
You may be looking for the convolution of the vectors A and X:
Y = conv(A, X);
If you're trying to do a fourier transform, you should also look at fft.