Matlab Vectorization - none-zero matrix row indices to cell - matlab

I am working with Matlab.
I have a binary square matrix. For each row, there is one or more entries of 1. I want to go through each row of this matrix and return the index of those 1s and store them in the entry of a cell.
I was wondering if there is a way to do this without looping over all the rows of this matrix, as for loop is really slow in Matlab.
For example, my matrix
M = 0 1 0
1 0 1
1 1 1
Then eventually, I want something like
A = [2]
[1,3]
[1,2,3]
So A is a cell.
Is there a way to achieve this goal without using for loop, with the aim of calculating the result more quickly?

At the bottom of this answer is some benchmarking code, since you clarified that you're interested in performance rather than arbitrarily avoiding for loops.
In fact, I think for loops are probably the most performant option here. Since the "new" (2015b) JIT engine was introduced (source) for loops are not inherently slow - in fact they are optimised internally.
You can see from the benchmark that the mat2cell option offered by ThomasIsCoding here is very slow...
If we get rid of that line to make the scale clearer, then my splitapply method is fairly slow, obchardon's accumarray option is a bit better, but the fastest (and comparable) options are either using arrayfun (as also suggested by Thomas) or a for loop. Note that arrayfun is basically a for loop in disguise for most use-cases, so this isn't a surprising tie!
I would recommend you use a for loop for increased code readability and the best performance.
Edit:
If we assume that looping is the fastest approach, we can make some optimisations around the find command.
Specifically
Make M logical. As the below plot shows, this can be faster for relatively small M, but slower with the trade-off of type conversion for large M.
Use a logical M to index an array 1:size(M,2) instead of using find. This avoids the slowest part of the loop (the find command) and outweighs the type conversion overhead, making it the quickest option.
Here is my recommendation for best performance:
function A = f_forlooplogicalindexing( M )
M = logical(M);
k = 1:size(M,2);
N = size(M,1);
A = cell(N,1);
for r = 1:N
A{r} = k(M(r,:));
end
end
I've added this to the benchmark below, here is the comparison of loop-style approaches:
Benchmarking code:
rng(904); % Gives OP example for randi([0,1],3)
p = 2:12;
T = NaN( numel(p), 7 );
for ii = p
N = 2^ii;
M = randi([0,1],N);
fprintf( 'N = 2^%.0f = %.0f\n', log2(N), N );
f1 = #()f_arrayfun( M );
f2 = #()f_mat2cell( M );
f3 = #()f_accumarray( M );
f4 = #()f_splitapply( M );
f5 = #()f_forloop( M );
f6 = #()f_forlooplogical( M );
f7 = #()f_forlooplogicalindexing( M );
T(ii, 1) = timeit( f1 );
T(ii, 2) = timeit( f2 );
T(ii, 3) = timeit( f3 );
T(ii, 4) = timeit( f4 );
T(ii, 5) = timeit( f5 );
T(ii, 6) = timeit( f6 );
T(ii, 7) = timeit( f7 );
end
plot( (2.^p).', T(2:end,:) );
legend( {'arrayfun','mat2cell','accumarray','splitapply','for loop',...
'for loop logical', 'for loop logical + indexing'} );
grid on;
xlabel( 'N, where M = random N*N matrix of 1 or 0' );
ylabel( 'Execution time (s)' );
disp( 'Done' );
function A = f_arrayfun( M )
A = arrayfun(#(r) find(M(r,:)),1:size(M,1),'UniformOutput',false);
end
function A = f_mat2cell( M )
[i,j] = find(M.');
A = mat2cell(i,arrayfun(#(r) sum(j==r),min(j):max(j)));
end
function A = f_accumarray( M )
[val,ind] = ind2sub(size(M),find(M.'));
A = accumarray(ind,val,[],#(x) {x});
end
function A = f_splitapply( M )
[r,c] = find(M);
A = splitapply( #(x) {x}, c, r );
end
function A = f_forloop( M )
N = size(M,1);
A = cell(N,1);
for r = 1:N
A{r} = find(M(r,:));
end
end
function A = f_forlooplogical( M )
M = logical(M);
N = size(M,1);
A = cell(N,1);
for r = 1:N
A{r} = find(M(r,:));
end
end
function A = f_forlooplogicalindexing( M )
M = logical(M);
k = 1:size(M,2);
N = size(M,1);
A = cell(N,1);
for r = 1:N
A{r} = k(M(r,:));
end
end

You can try arrayfun like below, which sweep through rows of M
A = arrayfun(#(r) find(M(r,:)),1:size(M,1),'UniformOutput',false)
A =
{
[1,1] = 2
[1,2] =
1 3
[1,3] =
1 2 3
}
or (a slower approach by mat2cell)
[i,j] = find(M.');
A = mat2cell(i,arrayfun(#(r) sum(j==r),min(j):max(j)))
A =
{
[1,1] = 2
[2,1] =
1
3
[3,1] =
1
2
3
}

Using accumarray:
M = [0 1 0
1 0 1
1 1 1];
[val,ind] = find(M.');
A = accumarray(ind,val,[],#(x) {x});

You can use strfind :
A = strfind(cellstr(char(M)), char(1));

Edit: I added a benchmark, the results show that a for loop is more efficient than accumarray.
You can usefind and accumarray:
[c, r] = find(A');
C = accumarray(r, c, [], #(v) {v'});
The matrix is transposed (A') because find groups by column.
Example:
A = [1 0 0 1 0
0 1 0 0 0
0 0 1 1 0
1 0 1 0 1];
% Find nonzero rows and colums
[c, r] = find(A');
% Group row indices for each columns
C = accumarray(r, c, [], #(v) {v'});
% Display cell array contents
celldisp(C)
Output:
C{1} =
1 4
C{2} =
2
C{3} =
3 4
C{4} =
1 3 5
Benchmark:
m = 10000;
n = 10000;
A = randi([0 1], m,n);
disp('accumarray:')
tic
[c, r] = find(A');
C = accumarray(r, c, [], #(v) {v'});
toc
disp(' ')
disp('For loop:')
tic
C = cell([size(A,1) 1]);
for i = 1:size(A,1)
C{i} = find(A(i,:));
end
toc
Result:
accumarray:
Elapsed time is 2.407773 seconds.
For loop:
Elapsed time is 1.671387 seconds.
A for loop is more efficient than accumarray...

Related

Smarter way to generate a matrix of zeros and ones in Matlab

I would like to generate all the possible adjacency matrices (zero diagonale) of an undirected graph of n nodes.
For example, with no relabeling for n=3 we get 23(3-1)/2 = 8 possible network configurations (or adjacency matrices).
One solution that works for n = 3 (and which I think is quite stupid) would be the following:
n = 3;
A = [];
for k = 0:1
for j = 0:1
for i = 0:1
m = [0 , i , j ; i , 0 , k ; j , k , 0 ];
A = [A, m];
end
end
end
Also I though of the following which seems to be faster but something is wrong with my indexing since 2 matrices are missing:
n = 3
C = [];
E = [];
A = zeros(n);
for i = 1:n
for j = i+1:n
A(i,j) = 1;
A(j,i) = 1;
C = [C,A];
end
end
B = ones(n);
B = B- diag(diag(ones(n)));
for i = 1:n
for j = i+1:n
B(i,j) = 0;
B(j,i) = 0;
E = [E,B];
end
end
D = [C,E]
Is there a faster way of doing this?
I would definitely generate the off-diagonal elements of the adjacency matrices with binary encoding:
n = 4; %// number of nodes
m = n*(n-1)/2;
offdiags = dec2bin(0:2^m-1,m)-48; %//every 2^m-1 possible configurations
If you have the Statistics and Machine Learning Toolbox, then squareform will easily create the matrices for you, one by one:
%// this is basically a for loop
tmpcell = arrayfun(#(k) squareform(offdiags(k,:)),1:size(offdiags,1),...
'uniformoutput',false);
A = cat(2,tmpcell{:}); %// concatenate the matrices in tmpcell
Although I'd consider concatenating along dimension 3, then you can see each matrix individually and conveniently.
Alternatively, you can do the array synthesis yourself in a vectorized way, it's probably even quicker (at the cost of more memory):
A = zeros(n,n,2^m);
%// lazy person's indexing scheme:
[ind_i,ind_j,ind_k] = meshgrid(1:n,1:n,1:2^m);
A(ind_i>ind_j) = offdiags.'; %'// watch out for the transpose
%// copy to upper diagonal:
A = A + permute(A,[2 1 3]); %// n x n x 2^m matrix
%// reshape to n*[] matrix if you wish
A = reshape(A,n,[]); %// n x (n*2^m) matrix

How to find neighbors in 4D array in MATLAB?

I am a bit confused and would greatly appreciate some help.
I have read many posts about finding neighboring pixels, with this being extremely helpful:
http://blogs.mathworks.com/steve/2008/02/25/neighbor-indexing-2/
However I have trouble applying it on a 4D matrix (A) with size(A)=[8 340 340 15]. It represents 8 groups of 3D images (15 slices each) of which I want to get the neighbors.
I am not sure which size to use in order to calculate the offsets. This is the code I tried, but I think it is not working because the offsets should be adapted for 4 dimensions? How can I do it without a loop?
%A is a 4D matrix with 0 or 1 values
Aidx = find(A);
% loop here?
[~,M,~,~] =size(A);
neighbor_offsets = [-1, M, 1, -M]';
neighbors_idx = bsxfun(#plus, Aidx', neighbor_offsets(:));
neighbors = B(neighbors_idx);
Thanks,
ziggy
Have you considered using convn?
msk = [0 1 0; 1 0 1; 0 1 0];
msk4d = permute( msk, [3 1 2 4] ); % make it 1-3-3-1 mask
neighbors_idx = find( convn( A, msk4d, 'same' ) > 0 );
You might find conndef useful for defining the basic msk in a general way.
Not sure if I've understood your question but what about this sort of approach:
if you matrix is 1D:
M = rand(10,1);
N = M(k-1:k+1); %//immediate neighbours of k
However this could error if k is at the boundary. This is easy to fix using max and min:
N = M(max(k-1,1):min(k+1,size(M,1))
Now lets add a dimenion:
M = rand(10,10);
N = M(max(k1-1,1):min(k1+1,size(M,1), max(k2-1,1):min(k2+1,size(M,2))
That was easy, all you had to do was repeat the same index making the minor change of using size(M,2) for the boundary (and also I changed k to k1 and k2, you might find using an array for k instead of separate k1 and k2 variables works better i.e. k(1) and k(2))
OK so now lets skip to 4 dimensions:
M = rand(10,10,10,10);
N = M(max(k(1)-1,1):min(k(1)+1,size(M,1)), ...
max(k(2)-1,1):min(k(2)+1,size(M,2)), ...
max(k(3)-1,1):min(k(3)+1,size(M,3)), ...
max(k(4)-1,1):min(k(4)+1,size(M,4))); %// Also you can replace all the `size(M,i)` with `end` if you like
I know you said you didn't want a loop, but what about a really short loop just to refactor a bit and also make it generalized:
n=ndims(M);
ind{n} = 0;
for dim = 1:n
ind{dim} = max(k(dim)-1,1):min(k(dim)+1,size(M,dim));
end
N = M(ind{:});
Here's how to get the neighbors along the second dimension
sz = size( A );
ndims = numel(sz); % number of dimensions
[d{1:ndims}] = ind2sub( sz, find( A ) );
alongD = 2; % work along this dim
np = d{alongD} + 1;
sel = np <= sz( alongD ); % discard neighbors that fall outside image boundary
nm = d{alongD} - 1;
sel = sel & nm > 0; % discard neighbors that fall outside image boundary
d = cellfun( #(x) x(sel), d, 'uni', 0 );
neighbors = cat( 1, ...
ind2sub( sz, d{1:alongD-1}, np(sel), d{alongD+1:end} ),...
ind2sub( sz, d{1:alongD-1}, nm(sel), d{alongD+1:end} ) );

Subtracting each elements of a row vector , size (1 x n) from a matrix of size (m x n)

I have two matrices of big sizes, which are something similar to the following matrices.
m; with size 1000 by 10
n; with size 1 by 10.
I would like to subtract each element of n from all elements of m to get ten different matrices, each has size of 1000 by 10.
I started as follows
clc;clear;
nrow = 10000;
ncol = 10;
t = length(n)
for i = 1:nrow;
for j = 1:ncol;
for t = 1:length(n);
m1(i,j) = m(i,j)-n(1);
m2(i,j) = m(i,j)-n(2);
m3(i,j) = m(i,j)-n(3);
m4(i,j) = m(i,j)-n(4);
m5(i,j) = m(i,j)-n(5);
m6(i,j) = m(i,j)-n(6);
m7(i,j) = m(i,j)-n(7);
m8(i,j) = m(i,j)-n(8);
m9(i,j) = m(i,j)-n(9);
m10(i,j) = m(i,j)-n(10);
end
end
end
can any one help me how can I do it without writing the ten equations inside the loop? Or can suggest me any convenient way especially when the two matrices has many columns.
Why can't you just do this:
m01 = m - n(1);
...
m10 = m - n(10);
What do you need the loop for?
Even better:
N = length(n);
m2 = cell(N, 1);
for k = 1:N
m2{k} = m - n(k);
end
Here we go loopless:
nrow = 10000;
ncol = 10;
%example data
m = ones(nrow,ncol);
n = 1:ncol;
M = repmat(m,1,1,ncol);
N = permute( repmat(n,nrow,1,ncol) , [1 3 2] );
result = bsxfun(#minus, M, N );
%or just
result = M-N;
Elapsed time is 0.018499 seconds.
or as recommended by Luis Mendo:
M = repmat(m,1,1,ncol);
result = bsxfun(#minus, m, permute(n, [1 3 2]) );
Elapsed time is 0.000094 seconds.
please make sure that your input vectors have the same orientation like in my example, otherwise you could get in trouble. You should be able to obtain that by transposements or you have to modify this line:
permute( repmat(n,nrow,1,ncol) , [1 3 2] )
according to your needs.
You mentioned in a comment that you want to count the negative elements in each of the obtained columns:
A = result; %backup results
A(A > 0) = 0; %set non-negative elements to zero
D = sum( logical(A),3 );
which will return the desired 10000x10 matrix with quantities of negative elements. (Please verify it, I may got a little confused with the dimensions ;))
Create the three dimensional result matrix. Store your results, for example, in third dimension.
clc;clear;
nrow = 10000;
ncol = 10;
N = length(n);
resultMatrix = zeros(nrow, ncol, N);
neg = zeros(ncol, N); % amount of negative values
for j = 1:ncol
for i = 1:nrow
for t = 1:N
resultMatrix(i,j,t) = m(i,j) - n(t);
end
end
for t = 1:N
neg(j,t) = length( find(resultMatrix(:,j,t) < 0) );
end
end

How to vectorize a sum of dot products between a normal and array of points

I need to evaluate following expression (in pseudo-math notation):
∑ipi⋅n
where p is a matrix of three-element vectors and n is a three-element vector. I can do this with for loops as follows but I can't figure out
how to vectorize this:
p = [1 1 1; 2 2 2];
n = [3 3 3];
s = 0;
for i = 1:size(p, 1)
s = s + dot(p(i, :), n)
end
Why complicate things? How about simple matrix multiplication:
s = sum(p * n(:))
where p is assumed to be an M-by-3 matrix.
I think you can do it with bsxfun:
sum(sum(bsxfun(#times,p,n)))
----------
% Is it the same for this case?
----------
n = 200; % depending on the computer it might be
m = 1000*n; % that n needs to be chosen differently
A = randn(n,m);
x = randn(n,1);
p = zeros(m,1);
q = zeros(1,m);
tic;
for i = 1:m
p(i) = sum(x.*A(:,i));
q(i) = sum(x.*A(:,i));
end
time = toc; disp(['time = ',num2str(time)]);

concatenation of N^2 3x3 matrixes into a 3Nx3N matrix

I have N^2 matrixes.
Each one is a 3x3 matrix.
One way to concatenation them to a 3Nx3N matrix is to write
A(:,:,i)= # 3x3 matrix i=1:N^2
B=[A11 A12 ..A1N;A21 ...A2N;...]
But When N is large is a tedious work.
What do you offer?
Here's a really fast one-liner that only uses RESHAPE and PERMUTE:
B = reshape(permute(reshape(A,3,3*N,N),[2 1 3]),3*N,3*N).';
And a test:
>> N=2;
>> A = rand(3,3,N^2)
A(:,:,1) =
0.5909 0.6571 0.8082
0.7118 0.6090 0.7183
0.4694 0.9588 0.5582
A(:,:,2) =
0.1791 0.6844 0.6286
0.4164 0.4140 0.5833
0.1380 0.1099 0.8970
A(:,:,3) =
0.2232 0.2355 0.1214
0.1782 0.6873 0.3394
0.5645 0.4745 0.9763
A(:,:,4) =
0.5334 0.7559 0.9984
0.8454 0.7618 0.1065
0.0549 0.5029 0.3226
>> B = reshape(permute(reshape(A,3,3*N,N),[2 1 3]),3*N,3*N).'
B =
0.5909 0.6571 0.8082 0.1791 0.6844 0.6286
0.7118 0.6090 0.7183 0.4164 0.4140 0.5833
0.4694 0.9588 0.5582 0.1380 0.1099 0.8970
0.2232 0.2355 0.1214 0.5334 0.7559 0.9984
0.1782 0.6873 0.3394 0.8454 0.7618 0.1065
0.5645 0.4745 0.9763 0.0549 0.5029 0.3226
Try the following code:
N = 4;
A = rand(3,3,N^2); %# 3-by-3-by-N^2
c1 = squeeze( num2cell(A,[1 2]) );
c2 = cell(N,1);
for i=0:N-1
c2{i+1} = cat(2, c1{i*N+1:(i+1)*N});
end
B = cat(1, c2{:}); %# 3N-by-3N
Another possibility involving mat2cell and reshape
N = 2;
A = rand(3,3,N^2);
C = mat2cell(A,3,3,ones(N^2,1));
C = reshape(C,N,N)'; %'# make a N-by-N cell array and transpose
%# catenate into 3N-by-3N cell array
B = cell2mat(C);
Here's the same in one line if you like that better
B = cell2mat(reshape(mat2cell(A,2,2,ones(N^2,1)),N,N)');
For N=2
>> A = rand(3,3,N^2)
A(:,:,1) =
0.40181 0.12332 0.41727
0.075967 0.18391 0.049654
0.23992 0.23995 0.90272
A(:,:,2) =
0.94479 0.33772 0.1112
0.49086 0.90005 0.78025
0.48925 0.36925 0.38974
A(:,:,3) =
0.24169 0.13197 0.57521
0.40391 0.94205 0.05978
0.096455 0.95613 0.23478
A(:,:,4) =
0.35316 0.043024 0.73172
0.82119 0.16899 0.64775
0.015403 0.64912 0.45092
B =
0.40181 0.12332 0.41727 0.94479 0.33772 0.1112
0.075967 0.18391 0.049654 0.49086 0.90005 0.78025
0.23992 0.23995 0.90272 0.48925 0.36925 0.38974
0.24169 0.13197 0.57521 0.35316 0.043024 0.73172
0.40391 0.94205 0.05978 0.82119 0.16899 0.64775
0.096455 0.95613 0.23478 0.015403 0.64912 0.45092
Why not do the old fashioned pre-allocate and loop? Should be pretty fast.
N = 4;
A = rand(3,3,N^2); % Assuming column major order for Aij
8
B = zeros(3*N, 3*N);
for j = 1:N^2
ix = mod(j-1, N)*3 + 1;
iy = floor((j-1)/N)*3 + 1;
fprintf('%02d - %02d\n', ix, iy);
B(ix:ix+2, iy:iy+2) = A(:,:,j);
end
EDIT: For the speed junkies out here are the rankings:
N = 200;
A = rand(3,3,N^2); % test set
#gnovice solution: Elapsed time is 0.013069 seconds.
#Amro solution: Elapsed time is 0.203308 seconds.
#Rich C solution: Elapsed time is 0.887077 seconds.
#Jonas solution: Elapsed time is 7.065174 seconds.