I have a big matrix M (n-by-m). I want to sum selected elements of it, where the indices for each sum are stored as vectors in a cell array. There are many groups of indices, so the cell array has many elements. For example
M = rand(2103, 2030);
index{1} = [1 3 2 4 53 5 23 3];
index{2} = [2 3 1 3 23 10234 2032];
% ...
index{2032} = ...;
I want to sum all the elements at index{1}, then all the elements at index{2}, and so on. Right now I am using a loop:
sums = zeros(1, 2032);
for n=1:2032
sums(n) = sum(M(index{n}));
end
I am wondering if there is any way to do this with a one-line command instead of a loop. Using a loop is pretty slow.
Probably a classic use of cellfun:
sums = cellfun(@(idx) sum(M(idx)), index);
EDIT: here is a benchmark for a large case, which shows that this approach is slightly slower than a for loop but faster than Eitan T's method:
M = rand(2103, 2030);
index = cell(1, 2032);
index{1} = [1 3 2 4 53 5 23 3];
index{2} = [2 3 1 3 23 10234 2032];
for n=3:2032
index{n} = randi(numel(M), 1, randi(10000));
end
N = 1e1;
sums = zeros(1, 2032);
tic
for kk = 1:N
for n=1:2032
sums(n) = sum(M(index{n}));
end
end
toc
tic
for kk = 1:N
sums = cellfun(@(idx) sum(M(idx)), index);
end
toc
tic
for kk = 1:N
sums = cumsum(M([index{:}]));
sums = diff([0, sums(cumsum(cellfun('length', index)))]);
end
toc
results in
Elapsed time is 2.072292 seconds.
Elapsed time is 2.139882 seconds.
Elapsed time is 2.669894 seconds.
Perhaps not as elegant as a cellfun one-liner, but runs more than an order of magnitude faster:
sums = cumsum(M([index{:}]));
sums = diff([0, sums(cumsum(cellfun('length', index)))]);
It even runs approximately 4 or 5 times faster than a JIT-accelerated loop for large inputs. Note that when each cell in index contains a vector with more than ~2000 elements, the performance of this approach begins to deteriorate in comparison with a loop (and cellfun).
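To see why those two lines give the per-group sums, here is a tiny worked sketch (a made-up 4-by-4 matrix and three small index groups, not the benchmark data): M([index{:}]) concatenates the indexed elements group after group, cumsum turns that into running totals, and sampling the running totals at the end of each group and differencing recovers each group's sum.
M = magic(4);
index = {[1 3], [2 2 4], 5};
cs = cumsum(M([index{:}]));                 % running sum over the concatenated groups
ends = cumsum(cellfun('length', index));    % position where each group ends
sums = diff([0, cs(ends)])                  % per-group sums: [25 14 2]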
Benchmark
M = rand(2103, 2030);
I = ceil(numel(M) * rand(2032, 10));
index = mat2cell(I, ones(size(I, 1), 1), size(I, 2));
N = 100;
tic
for k = 1:N
sums = zeros(1, numel(index));
for n = 1:numel(sums)
sums(n) = sum(M(index{n}));
end
end
toc
tic
for k = 1:N
sums = cellfun(@(idx) sum(M(idx)), index);
end
toc
tic
for k = 1:N
sums = cumsum(M([index{:}]));
sums2 = diff([0, sums(cumsum(cellfun('length', index)))]);
end
toc
When executing this in MATLAB 2012a (Windows Server 2008 R2 running on a 2.27GHz 16-core Intel Xeon processor), I got:
Elapsed time is 0.579783 seconds.
Elapsed time is 1.789809 seconds.
Elapsed time is 0.111455 seconds.
Related
Suppose that c is a scalar value, T and W are M-by-N matrices, k is another M-by-N matrix containing values from 1 to M (and there are at least two pairs (i1, j1), (i2, j2) such that k(i1, j1)==k(i2, j2)) and a is a 1-by-M vector. I want to vectorize the following code (hoping that this will speed it up):
T = zeros(M,N);
for j = 1:N
for i = 1:M
T(k(i,j),j) = T(k(i,j),j) + c*W(i,j)/a(i);
end
end
Do you have any tips so that I can vectorize this code (or make it faster in general)?
Thanks in advance!
Since k only ever affects how values are aggregated within a column, but never between columns, you can achieve a slight speedup by reducing the problem to a single loop over columns and using accumarray like so:
T = zeros(M, N);
for col = 1:N
T(:, col) = accumarray(k(:,col), c*W(:, col)./a, [M 1]);
end
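As a quick sanity check (a sketch with made-up small sizes, not the benchmark inputs), the per-column accumarray call can be compared against the original double loop:
M = 5; N = 4; c = 2.34;
W = rand(M, N); a = rand(M, 1); k = randi(M, M, N);
Tloop = zeros(M, N);
for j = 1:N
for i = 1:M
Tloop(k(i,j), j) = Tloop(k(i,j), j) + c*W(i,j)/a(i);
end
end
Tacc = zeros(M, N);
for col = 1:N
Tacc(:, col) = accumarray(k(:,col), c*W(:, col)./a, [M 1]);
end
max(abs(Tloop(:) - Tacc(:)))   % ~0, up to floating-point rounding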
I tested each of the solutions (the loop in your question, rahnema's, Divakar's, and mine) by taking the average of 100 iterations using input values initialized as in Divakar's answer. Here's what I got (running Windows 7 x64, 16 GB RAM, MATLAB R2016b):
solution | avg. time (s) | max(abs(err))
---------+---------------+---------------
loop     |    0.12461    |      0
rahnema  |    0.84518    |      0
divakar  |    0.12381    |  1.819e-12
gnovice  |    0.09477    |      0
The take-away: loops actually aren't so bad, but if you can simplify them into one it can save you a little time.
Here's an approach with a combination of bsxfun and accumarray -
% Create 2D array of unique IDs along each col to be used as flattened subs
id = bsxfun(@plus,k,M*(0:N-1));
% Compute "c*W(i,j)/a(i)" for all i's and j's
cWa = c*bsxfun(@rdivide,W,a);
% Accumulate final result for all cols
out = reshape(accumarray(id(:),reshape(cWa,[],1),[M*N 1]),[M,N]);
Benchmarking
Approaches as functions -
function out = func1(W,a,c,k,M,N)
id = bsxfun(@plus,k,M*(0:N-1));
cWa = c*bsxfun(@rdivide,W,a);
out = reshape(accumarray(id(:),reshape(cWa,[],1),[M*N 1]),[M,N]);
function T = func2(W,a,c,k,M,N) % @rahnema1's solution
[I J] = meshgrid(1:M,1:N);
idx1 = sub2ind([M N], I ,J);
R = c.* W(idx1) ./ a(I);
T = accumarray([k(idx1(:)) ,J(:)], R(:),[M N]);
function T = func3(W,a,c,k,M,N) % Original approach
T = zeros(M,N);
for j = 1:N
for i = 1:M
T(k(i,j),j) = T(k(i,j),j) + c*W(i,j)/a(i);
end
end
function T = func4(W,a,c,k,M,N) % @gnovice's solution
T = zeros(M, N);
for col = 1:N
T(:, col) = accumarray(k(:,col), c*W(:, col)./a, [M 1]);
end
Machine setup : Kubuntu 16.04, MATLAB 2012a, 4GB RAM.
Timing code -
% Setup inputs
M = 3000;
N = 3000;
W = rand(M,N);
a = rand(M,1);
c = 2.34;
k = randi([1,M],[M,N]);
disp('------------------ With func1')
tic,out = func1(W,a,c,k,M,N);toc
clear out
disp('------------------ With func2')
tic,out = func2(W,a,c,k,M,N);toc
clear out
disp('------------------ With func3')
tic,out = func3(W,a,c,k,M,N);toc
clear out
disp('------------------ With func4')
tic,out = func4(W,a,c,k,M,N);toc
Timing code run -
------------------ With func1
Elapsed time is 0.215591 seconds.
------------------ With func2
Elapsed time is 1.555373 seconds.
------------------ With func3
Elapsed time is 0.572668 seconds.
------------------ With func4
Elapsed time is 0.291552 seconds.
Possible improvements in proposed approach
1] In c*bsxfun(@rdivide,W,a) we use two stages of broadcasting: one inside bsxfun(@rdivide,W,a), where a is broadcast, and a second one when the scalar c is multiplied against the 2D output of bsxfun(@rdivide,W,a), even though bsxfun isn't needed for that step. A possible improvement is to fold c into the division by a first, so that c is broadcast only against the 1D vector a, and the single remaining broadcast goes from the 1D vector c./a to the 2D matrix W, just as before. This minor improvement can be timed -
>> tic, c*bsxfun(@rdivide,W,a); toc
Elapsed time is 0.073244 seconds.
>> tic, bsxfun(@times,W,c./a); toc
Elapsed time is 0.041745 seconds.
But in cases where c and a differ greatly in magnitude, forming the scaling factor c./a first can affect the final result appreciably (floating-point rounding), so one needs to be careful with this suggestion.
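If you want to fold that suggestion back into the proposed approach, the modified function could look like this sketch (func1b is a hypothetical name, and it is subject to the precision caveat above):
function out = func1b(W,a,c,k,M,N)
id = bsxfun(@plus,k,M*(0:N-1));       % unique linear IDs along each col
cWa = bsxfun(@times,W,c./a);          % c*W(i,j)/a(i) with a single broadcast
out = reshape(accumarray(id(:),cWa(:),[M*N 1]),[M,N]);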
A possible solution:
[I J] = meshgrid(1:M,1:N);
idx1 = sub2ind([M N], I ,J);
R = c.* W(idx1) ./ a(I);
T = accumarray([k(idx1(:)) ,J(:)], R(:),[M N]);
Comparison of the different methods in Octave (without JIT):
------------------ Divakar
Elapsed time is 0.282008 seconds.
------------------ rahnema1
Elapsed time is 1.08827 seconds.
------------------ gnovice
Elapsed time is 0.418701 seconds.
------------------ loop
doesn't complete in 15 seconds.
Now I have a matrix A of dimension N-by-p and another matrix B of dimension N-by-q. What I want is a matrix, say C, of dimension N-by-pq such that
C(i,:) = kron(A(i,:), B(i,:));
If N is large, looping over the N rows may take quite a long time. So currently I am augmenting A and B appropriately (combining repmat, permute and reshape) to turn each into a matrix of dimension N-by-pq, and then forming C with something like
C = A_aug .* B_aug;
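One possible such construction, shown here only as a sketch of the kind of augmentation meant (the exact A_aug/B_aug code is not reproduced), is:
[N, p] = size(A);
q = size(B, 2);
A_aug = reshape(repmat(permute(A, [1 3 2]), [1 q 1]), N, p*q);  % each column of A repeated q times, in kron order
B_aug = repmat(B, [1 p]);                                       % B tiled p times along the columns
C = A_aug .* B_aug;                                             % row-wise Kronecker product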
Any better idea?
Check out some bsxfun + permute + reshape magic -
out = reshape(bsxfun(@times,permute(A,[1 3 2]),B),size(A,1),[])
Benchmarking & Verification
Benchmarking code -
%// Setup inputs
N = 200;
p = 190;
q = 180;
A = rand(N,p);
B = rand(N,q);
disp('--------------------------------------- Without magic')
tic
C = zeros(size(A,1),size(A,2)*size(B,2));
for i = 1:size(A,1)
C(i,:) = kron(A(i,:), B(i,:));
end
toc
disp('--------------------------------------- With some magic')
tic
out = reshape(bsxfun(@times,permute(A,[1 3 2]),B),size(A,1),[]);
toc
error_val = max(abs(C(:)-out(:)))
Output -
--------------------------------------- Without magic
Elapsed time is 0.524396 seconds.
--------------------------------------- With some magic
Elapsed time is 0.055082 seconds.
error_val =
0
I am very new to MATLAB. I am trying to implement the sum of the series 1 + x + x^2/2! + x^3/3! + ..., but I could not figure out how to do it. So far I have only summed plain numbers. Help please.
% a is some vector of numbers I already have
sum_a = 0;
for ii = 1:length(a)
sum_a = sum_a + a(ii);
end
sum_a
n = 0 : 10; % elements of the series
x = 2; % value of x
s = sum(x .^ n ./ factorial(n)); % sum
If you want the number of terms to come from user input, the first line can instead be:
n = 0:input('variable?')
Cheery's approach is perfectly valid when the number of terms in the series is small. For large values a faster approach is the following, which is more efficient because it avoids repeated multiplications:
m = 10;
x = 2;
result = 1+sum(cumprod(x./[1:m]));
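As a quick sanity check (a small sketch), the cumprod formulation gives the same partial sum as the explicit term-by-term version, and both approach exp(x):
m = 10; x = 2;
r1 = 1 + sum(cumprod(x ./ (1:m)));        % cumprod form
r2 = sum(x .^ (0:m) ./ factorial(0:m));   % explicit terms x^k/k!
abs(r1 - r2)     % ~0, up to rounding
abs(r1 - exp(x)) % small truncation error of the partial sum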
Example running times for m = 1000 and x = 1:
tic
for k = 1:1e4
result = 1+sum(cumprod(x./[1:m]));
end
toc
tic
for k = 1:1e4
result = sum(x.^(0:m)./factorial(0:m));
end
toc
gives
Elapsed time is 1.572464 seconds.
Elapsed time is 2.999566 seconds.
Given the matrix:
a =
1 1 2 2
1 1 2 2
3 3 4 4
3 3 4 4
I would like to get the following four 2x2 matrices:
a1 =
1 1
1 1
a2 =
2 2
2 2
a3 =
3 3
3 3
a4 =
4 4
4 4
From there, I would like to take the max of each matrix and then reshape the result into a 2x2 result matrix, like so:
r =
1 2
3 4
The positions of the max values in the result must correspond to their original positions in the initial matrix.
Currently, I'm using the following code to accomplish this:
w = 2
S = zeros(size(A, 1)/w);
for i = 1:size(S)
for j = 1:size(S)
Window = A(i*w-1:i*w, j*w-1:j*w);
S(i, j) = max(max(Window));
end
end
This works but it seems like there must be a way that doesn't involve iteration (vectorization).
I tried using reshape like so:
reshape(max(max(reshape(A, w, w, []))), w, w, [])
however that takes the max of the wrong values and returns:
ans =
3 4
3 4
Is there any way to accomplish this without iteration or otherwise improve my iterative method?
UPDATE: I'm not sure how I've ended up with the most votes (as of 2012-10-28). For anyone reading this, please see angainor's or Rody's answers for better solutions that don't require any additional toolboxes.
Here is a horse race of every answer thus far (excluding Nate's - sorry, I don't have the requisite toolbox):
Z = 1000;
A = [1 1 2 2; 1 1 2 2; 3 3 4 4; 3 3 4 4];
w = 2;
%Method 1 (OP method)
tic
for z = 1:Z
S = zeros(size(A, 1)/w);
for i = 1:size(S)
for j = 1:size(S)
Window = A(i*w-1:i*w, j*w-1:j*w);
S(i, j) = max(max(Window));
end
end
end
toc
%Method 2 (My double loop with improved indexing)
tic
for z = 1:Z
wm = w - 1;
Soln2 = NaN(w, w);
for m = 1:w:size(A, 2)
for n = 1:w:size(A, 1)
Soln2((m+1)/2, (n+1)/2) = max(max(A(n:n+wm, m:m+wm)));
end
end
Soln2 = Soln2';
end
toc
%Method 3 (My one line method)
tic
for z = 1:Z
Soln = cell2mat(cellfun(@max, cellfun(@max, mat2cell(A, [w w], [w w]), 'UniformOutput', false), 'UniformOutput', false));
end
toc
%Method 4 (Rody's method)
tic
for z = 1:Z
b = [A(1:2,:) A(3:4,:)];
reshape(max(reshape(b, 4,[])), 2,2);
end
toc
The results of the speed test (the loop over z) are:
Elapsed time is 0.042246 seconds.
Elapsed time is 0.019071 seconds.
Elapsed time is 0.165239 seconds.
Elapsed time is 0.011743 seconds.
Drat! It appears that Rody (+1) is the winner. :-)
UPDATE: New entrant to the race angainor (+1) takes the lead!
Not very general, but it works for a:
b = [a(1:2,:) a(3:4,:)];
reshape(max(reshape(b, 4,[])), 2,2).'
The general version of this is a bit *ahum* fuglier:
% window size
W = [2 2];
% number of blocks (rows, cols)
nW = size(a)./W;
% indices to first block
ids = bsxfun(@plus, (1:W(1)).', (0:W(2)-1)*size(a,1));
% indices to all blocks in first block-column
ids = bsxfun(@plus, ids(:), (0:nW(1)-1)*W(1));
% indices to all blocks
ids = reshape(bsxfun(@plus, ids(:), 0:nW(1)*prod(W):numel(a)-1), size(ids,1),[]);
% maxima
M = reshape(max(a(ids)), nW)
It can be done a bit more elegantly:
b = kron(reshape(1:prod(nW), nW), ones(W));
C = arrayfun(@(x) find(b==x), 1:prod(nW), 'uni', false);
M = reshape(max(a([C{:}])), nW)
but I doubt that's gonna be faster...
Another option: slower than the cell2mat(cellfun...) code, but gives the intermediate step:
fun = @(block_struct) reshape((block_struct.data), [],1);
B = reshape(blockproc(A,[2 2],fun),2,2,[])
r=reshape(max(max(B)) ,2,[])
B(:,:,1) =
1 1
1 1
B(:,:,2) =
3 3
3 3
B(:,:,3) =
2 2
2 2
B(:,:,4) =
4 4
4 4
r =
1 2
3 4
I'll join the horse-race with another non-general (yet;) solution, based on linear indices
idx = [1 2 5 6; 3 4 7 8]';
splita = [A(idx) A(idx+8)];
reshape(max(splita), 2, 2);
The times obtained with Colin's code, with my method added last:
Elapsed time is 0.039565 seconds.
Elapsed time is 0.021723 seconds.
Elapsed time is 0.168946 seconds.
Elapsed time is 0.011688 seconds.
Elapsed time is 0.006255 seconds.
The idx array can be easily generalized to larger windows and system sizes.
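As a sketch of one such generalization (assuming square w-by-w blocks that tile A exactly; the variable names here are made up):
w = 2;                                    % block size
[m, n] = size(A);
blk = bsxfun(@plus, (1:w)', (0:w-1)*m);   % linear indices of the first block
[ro, co] = ndgrid(0:w:m-1, 0:w:n-1);      % row/col offsets of every block's top-left corner
offs = ro(:)' + co(:)'*m;
idx = bsxfun(@plus, blk(:), offs);        % one column of linear indices per block
S = reshape(max(A(idx)), m/w, n/w)        % block maxima, [1 2; 3 4] for the example A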
Note: Nate's solution uses the Image Processing Toolbox function blockproc. I would rewrite that:
fun = @(x) max(max(x.data));
r = blockproc(A,[2 2],fun)
Comparing timings across different computers is fraught with difficulty, as is timing something only once when it finishes in a fraction of a second. TIMEIT would be useful here:
http://www.mathworks.com/matlabcentral/fileexchange/18798
But timing this on my computer with tic/toc took 0.008 seconds.
Cheers,
Brett
I have a very large matrix (216 rows, 31286 cols) of doubles. For reasons specific to the data, I want to average every 9 rows to produce one new row. So, the new matrix will have 216/9=24 rows.
I am a Matlab beginner so I was wondering if this solution I came up with can be improved upon. Basically, it loops over every group, sums up the rows, and then divides the new row by 9. Here's a simplified version of what I wrote:
matrix_avg = []
for group = 1:216/9
new_row = zeros(1, 31286);
idx_low = (group - 1) * 9 + 1;
idx_high = idx_low + 9 - 1;
% Add the 9 rows to new_row
for j = idx_low:idx_high
new_row = new_row + M(j,:);
end
% Compute the mean
new_row = new_row ./ 9;
matrix_avg = [matrix_avg; new_row];
end
You can reshape your big matrix from 216 x 31286 to 9 x (216/9 * 31286).
Then you can use mean, which operates on each column. Since your matrix only has 9 rows per column, this takes the 9-row average.
Then you can just reshape your matrix back.
% generate big matrix
M = rand([216 31286]);
n = 9; % want 9-row average.
% reshape
tmp = reshape(M, [n prod(size(M))/n]);
% mean column-wise (and only 9 rows per col)
tmp = mean(tmp);
% reshape back
matrix_avg = reshape(tmp, [ size(M,1)/n size(M,2) ]);
In a one-liner (but why would you?):
matrix_avg = reshape(mean(reshape(M,[n prod(size(M))/n])), [size(M,1)/n size(M,2)]);
Note - this will have problems if the number of rows in M isn't exactly divisible by 9, but so will your original code.
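If needed, one way to handle a row count that is not divisible by n is to apply the reshape trick to the largest divisible block of rows and average the leftover rows separately (a sketch, not benchmarked here):
nFull = floor(size(M,1)/n) * n;                          % largest multiple of n
matrix_avg = reshape(mean(reshape(M(1:nFull,:), n, [])), nFull/n, size(M,2));
if nFull < size(M,1)
matrix_avg = [matrix_avg; mean(M(nFull+1:end,:), 1)];    % mean of the remaining rows
end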
I measured the 4 solutions and here are the results:
reshape: Elapsed time is 0.017242 seconds.
blockproc [9 31286]: Elapsed time is 0.242044 seconds.
blockproc [9 1]: Elapsed time is 44.477094 seconds.
accumarray: Elapsed time is 103.274071 seconds.
This is the code I used:
M = rand(216,31286);
fprintf('reshape: ');
tic;
n = 9;
matrix_avg1 = reshape(mean(reshape(M,[n prod(size(M))/n])), [size(M,1)/n size(M,2)]);
toc
fprintf('blockproc [9 31286]: ');
tic;
fun = @(block_struct) mean(block_struct.data);
matrix_avg2 = blockproc(M,[9 31286],fun);
toc
fprintf('blockproc [9 1]: ');
tic;
fun = @(block_struct) mean(block_struct.data);
matrix_avg3 = blockproc(M,[9 1],fun);
toc
fprintf('accumarray: ');
tic;
[nR,nC] = size(M);
n2average = 9;
[xx,yy] = ndgrid(1:nR,1:nC);
xx = ceil(xx/n2average); %# makes xx 1 1 1 ... 1 2 2 2 ... (groups of n2average rows)
matrix_avg4 = accumarray([xx(:),yy(:)],M(:),[],@mean);
toc
Here's an alternative based on accumarray. You create an array of row and column indices into matrix_avg that tells you which element of matrix_avg each element of M contributes to, and then use accumarray to average the elements that contribute to the same element of matrix_avg. This solution works even if the number of rows in M is not divisible by 9.
M = rand(216,31286);
[nR,nC] = size(M);
n2average = 9;
[xx,yy] = ndgrid(1:nR,1:nC);
xx = ceil(xx/n2average); %# makes xx 1 1 1 ... 1 2 2 2 ... (groups of n2average rows)
matrix_avg = accumarray([xx(:),yy(:)],M(:),[],@mean);