Given a J-by-2 matrix, for example
A = [1 2 ; 3 4 ; 5 6]
I want to block diagonalize it. That is, I want:
B = [1 2 0 0 0 0 ; 0 0 3 4 0 0 ; 0 0 0 0 5 6].
One command that does this is:
blkdiag(A(1,:),A(2,:),A(3,:))
This will be slow and tedious if J is large. Is there a built-in MATLAB function that does this?
Here's one hacky solution for J x 2 array case using linear indexing -
%// Get number of rows
N = size(A,1);
%// Get linear indices of the first column elements positions in output array
idx = 1:2*N+1:(N-1)*(2*N+1)+1;
%// Setup output array
out = zeros(N,N*2);
%// Put first and second column elements into idx and idx+N positions
out([idx(:) idx(:)+N]) = A
The only function-call overhead (neglecting size, which should be minimal) comes from zeros, and even that can be removed with this undocumented initialization trick, which works only if out does not already exist in the workspace -
out(N,N*2) = 0; %// Instead of out = zeros(N,N*2);
Sample run -
A =
1 2
3 4
5 6
7 8
out =
1 2 0 0 0 0 0 0
0 0 3 4 0 0 0 0
0 0 0 0 5 6 0 0
0 0 0 0 0 0 7 8
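As a quick sanity check, the linear-indexing result can be compared against blkdiag on the small matrix from the question. This is only a sketch; the variable name expected is introduced here for illustration.

```matlab
%// Small input from the question
A = [1 2; 3 4; 5 6];
N = size(A,1);
%// Linear indices of the first-column elements in the N-by-2N output
idx = 1:2*N+1:(N-1)*(2*N+1)+1;
out = zeros(N,N*2);
%// First column goes to idx, second column one output column further (idx+N)
out([idx(:) idx(:)+N]) = A;
%// Reference result built row by row with blkdiag
expected = blkdiag(A(1,:),A(2,:),A(3,:));
disp(isequal(out,expected))
```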
Here's a benchmark of the solutions posted thus far.
Benchmarking Code
%//Set up some random data
J = 7000; A = rand(J,2);
%// Warm up tic/toc
for k = 1:100000
tic(); elapsed = toc();
end
disp('---------------------------------- With @mikkola solution')
tic
temp = mat2cell(A, ones(J,1), 2);
B = blkdiag(temp{:});
toc, clear B temp
disp('---------------------------------- With @Jeff Irwin solution')
tic
m = size(A, 1);
n = size(A, 2);
B = zeros(m, m * n);
for k = 1: n
B(:, k: n: m * n) = diag(A(:, k));
end
toc, clear B k m n
disp('---------------------------------- With Hacky1 solution')
tic
N = size(A,1);
idx = 1:2*N+1:(N-1)*(2*N+1)+1;
out = zeros(N,N*2);
out([idx(:) idx(:)+N]) = A;
toc, clear out idx N
disp('---------------------------------- With Hacky2 solution')
tic
N = size(A,1);
idx = 1:2*N+1:(N-1)*(2*N+1)+1;
out(N,N*2) = 0;
out([idx(:) idx(:)+N]) = A;
toc, clear out idx N
Runtimes
---------------------------------- With @mikkola solution
Elapsed time is 0.546584 seconds.
---------------------------------- With @Jeff Irwin solution
Elapsed time is 1.330666 seconds.
---------------------------------- With Hacky1 solution
Elapsed time is 0.455735 seconds.
---------------------------------- With Hacky2 solution
Elapsed time is 0.364227 seconds.
Here's one using mat2cell to split A into a J-by-1 cell array where each element contains a row of A. Then use the {:} operator to pass the contents to blkdiag as a comma-separated list:
%//Set up some random data
J = 100;
A = rand(J,2);
%// Solution for arbitrary J-by-2 A
temp = mat2cell(A, ones(J,1), 2);
B = blkdiag(temp{:})
Nice solutions! I had a few timing results here, but repeated them by running the benchmarking code by @Divakar. Results on my end are below.
---------------------------------- With @mikkola solution
Elapsed time is 0.100674 seconds.
---------------------------------- With @Jeff Irwin solution
Elapsed time is 0.283275 seconds.
---------------------------------- With @Divakar Hacky1 solution
Elapsed time is 0.079194 seconds.
---------------------------------- With @Divakar Hacky2 solution
Elapsed time is 0.051629 seconds.
Here's another solution. Not sure how efficient it is, but it works for a matrix A of any size.
A = [1 2; 3 4; 5 6]
m = size(A, 1)
n = size(A, 2)
B = zeros(m, m * n)
for k = 1: n
B(:, k: n: m * n) = diag(A(:, k))
end
I found a way to do this using sparse that beats the other solutions both in memory and clock time:
N = size(A,1);
ind_1 = [1:N].';
ind_2 = [1:2:2*N-1].';
A_1 = sparse(ind_1,ind_2,A(:,1),N,2*N);
ind_2 = [2:2:2*N].';
A_2 = sparse(ind_1,ind_2,A(:,2),N,2*N);
out = A_1 + A_2;
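The two sparse calls above could probably be merged into one by stacking the index vectors. A minimal sketch, assuming the same J-by-2 layout; the names rows and cols are introduced here for illustration and are not part of the original answer:

```matlab
%// Small input from the question
A = [1 2; 3 4; 5 6];
N = size(A,1);
%// Row i is used twice: once for each of the two entries of A(i,:)
rows = [1:N 1:N].';
%// Odd output columns hold A(:,1), even output columns hold A(:,2)
cols = [1:2:2*N-1 2:2:2*N].';
%// A(:) lists the first column of A followed by the second, matching rows/cols
out = sparse(rows, cols, A(:), N, 2*N);
disp(full(out))
```

Whether this is faster than adding two sparse matrices has not been measured here.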
Results are reported below using the same benchmarking code as @Divakar:
---------------------------------- With @mikkola solution
Elapsed time is 0.065136 seconds.
---------------------------------- With @Jeff Irwin solution
Elapsed time is 0.500264 seconds.
---------------------------------- With Hacky1 solution
Elapsed time is 0.200303 seconds.
---------------------------------- With Hacky2 solution
Elapsed time is 0.011991 seconds.
---------------------------------- With @Matt T solution
Elapsed time is 0.000712 seconds.
Related
How is it possible to create a sequence if I have vectors of starting and ending numbers of the subsequences in a vectorized way in Matlab?
Example Input:
A=[12 20 34]
B=[18 25 37]
I want to get (spacing for clarity):
C=[12 13 14 15 16 17 18 20 21 22 23 24 25 34 35 36 37]
Assuming that A and B are sorted ascending, contain positive integers, and that B(i)+1 < A(i+1) holds (so consecutive ranges are separated by at least one integer), then:
idx = zeros(1,max(B)+1);
idx(A) = 1;
idx(B+1) = -1;
C = find(cumsum(idx))
To get around the issue mentioned by Dennis in the comments:
m = min(A)-1;
A = A-m;
B = B-m;
idx = zeros(1,max(B)+1);
idx(A) = 1;
idx(B+1) = -1;
C = find(cumsum(idx)) + m;
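Applied to the inputs from the question, the shifted version can be traced as follows. This is a sketch only; the shifted copies As and Bs are names introduced here so that A and B stay unchanged.

```matlab
%// Inputs from the question
A = [12 20 34];
B = [18 25 37];
%// Shift so that the smallest start value becomes 1
m = min(A)-1;
As = A-m;
Bs = B-m;
%// +1 where a range starts, -1 just past where it ends
idx = zeros(1,max(Bs)+1);
idx(As) = 1;
idx(Bs+1) = -1;
%// cumsum is 1 inside a range and 0 outside; find gives the shifted values
C = find(cumsum(idx)) + m;
disp(C)
```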
cumsum based approach for a generic case (negative numbers or overlaps) -
%// Positions in intended output array at which group shifts
intv = cumsum([1 B-A+1])
%// Values to be put at those places with intention of doing cumsum at the end
put_vals = [A(1) A(2:end) - B(1:end-1)]
%// Get vector of ones and put_vals
id_arr = ones(1,intv(end)-1)
id_arr(intv(1:end-1)) = put_vals
%// Final output with cumsum of id_arr
out = cumsum(id_arr)
Sample run -
>> A,B
A =
-2 -3 1
B =
5 -1 3
>> out
out =
-2 -1 0 1 2 3 4 5 -3 -2 -1 1 2 3
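To see the steps on the inputs from the original question, here is a small sketch that also cross-checks the result against the arrayfun-based one-liner. It assumes B(i) >= A(i) for every i, so that no range is empty; the name ref is introduced here for illustration.

```matlab
%// Inputs from the question
A = [12 20 34];
B = [18 25 37];
%// Start position of each group in the output: [1 8 14 18]
intv = cumsum([1 B-A+1]);
%// First value, then the jump from the end of one group to the start of the next
put_vals = [A(1) A(2:end)-B(1:end-1)];
%// Steps of 1 everywhere, except at the group starts
id_arr = ones(1,intv(end)-1);
id_arr(intv(1:end-1)) = put_vals;
out = cumsum(id_arr);
%// Reference built range by range
ref = cell2mat(arrayfun(@(a,b) a:b, A, B, 'UniformOutput', false));
disp(isequal(out,ref))
```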
Benchmarking
Here's a runtime test after warming-up tic-toc to compare the various approaches listed to solve the problem for large sized A and B -
%// Create inputs
A = round(linspace(1,400000000,200000));
B = round((A(1:end-1) + A(2:end))/2);
B = [B A(end)+B(1)];
disp('------------------ Divakar Method')
.... Proposed approach in this solution
disp('------------------ Dan Method')
tic
idx = zeros(1,max(B)+1);
idx(A) = 1;
idx(B+1) = -1;
C = find(cumsum(idx));
toc, clear C idx
disp('------------------ Santhan Method')
tic
In = [A;B];
difIn = diff(In);
out1 = bsxfun(@plus, (0:max(difIn)).',A); %//'
mask = bsxfun(@le, (1:max(difIn)+1).',difIn+1); %//'
out1 = out1(mask).'; %//'
toc, clear out1 mask difIn In
disp('------------------ Itamar Method')
tic
C = cell2mat(cellfun(@(a,b){a:b},num2cell(A),num2cell(B)));
toc, clear C
disp('------------------ dlavila Method')
tic
C = cell2mat(arrayfun(@(a,b)a:b, A, B, 'UniformOutput', false));
toc
Runtimes
------------------ Divakar Method
Elapsed time is 0.793758 seconds.
------------------ Dan Method
Elapsed time is 2.640529 seconds.
------------------ Santhan Method
Elapsed time is 1.662889 seconds.
------------------ Itamar Method
Elapsed time is 2.524527 seconds.
------------------ dlavila Method
Elapsed time is 2.096454 seconds.
Alternative using bsxfun
%// Find the difference between Inputs
difIn = B - A;
%// Do colon on A to A+max(difIn)
out = bsxfun(@plus, (0:max(difIn)).',A); %//'
%// mask out which values you want
mask = bsxfun(@le, (1:max(difIn)+1).',difIn+1); %//'
%// getting only the masked values
out = out(mask).'
Results:
>> A,B
A =
-2 -4 1
B =
5 -3 3
>> out
out =
-2 -1 0 1 2 3 4 5 -4 -3 1 2 3
This is one option:
C = cell2mat(cellfun(@(a,b){a:b},num2cell(A),num2cell(B)));
You can do this:
C = cell2mat(arrayfun(@(a,b)a:b, A, B, 'UniformOutput', false))
arrayfun(...) creates all the subsequences by taking pairs of elements from A and B.
cell2mat is used to concatenate each subsequence.
I work with an image that I consider as a matrix.
I want to turn an 800 x 800 matrix (A) into a 400 x 400 matrix (B), where each cell of B is the mean of a 2 x 2 block of 4 cells of A (I know this is not valid code):
B[1,1] =mean2(A[1,1 + 1,2 + 2,1 + 2,2])
and so on for the whole matrix ...
B [1,2]=mean2(A[1,3 + 1,4 + 2,3 + 2,4 ])
I thought to :
1) Reshape the A matrix into a 2 x 320 000 matrix so I get the four cells I need to average next to each other and it is easier to deal with the row number afterwards.
Im4bis=reshape(permute(reshape(Im4,size(Im4,2),2,[]),[2,3,1]),2,[]);
2) Create a cell-array with the 4 cells I need to average (subsetted) and calculate the mean of it. That's where it doesn't work
I{1,160000}=ones,
for k=drange(1:2:319999)
for n=1:160000
I{n}=mean2(Im4bis(1:2,k:k+1));
end
end
I created an empty matrix of 400 x 400 cells (actually a vector of 1 x 160000) and I wanted to fill it with the mean but I get a matrix of 1 x 319 999 cells with one cell out of 2 empty.
Looking for light
My Input Image:
Method 1
Using mat2cell and cellfun
AC = mat2cell(A, repmat(2,size(A,1)/2,1), repmat(2,size(A,2)/2,1));
out = cellfun(@(x) mean(x(:)), AC);
Method 2
using im2col
out = reshape(mean(im2col(A,[2 2],'distinct')),size(A)./2);
Method 3
Using simple for loop
out(size(A,1)/2,size(A,2)/2) = 0;
k = 1;
for i = 1:2:size(A,1)
l = 1;
for j = 1:2:size(A,2)
out(k,l) = mean(mean(A(i:i+1,j:j+1)));
l = l+1;
end
k = k+1;
end
Test on input image:
A = rgb2gray(imread('inputImage.png'));
%// Here, You could use any of the method from any answers
%// or you could use the best method from the bench-marking tests done by Divakar
out = reshape(mean(im2col(A,[2 2],'distinct')),size(A)./2);
imshow(uint8(out));
imwrite(uint8(out),'outputImage.bmp');
Output Image:
Final check by reading the already written image
B = imread('outputImage.bmp');
>> whos B
Name Size Bytes Class Attributes
B 400x400 160000 uint8
Let A denote your matrix and
m = 2; %// block size: rows
n = 2; %// block size: columns
Method 1
Use blockproc:
B = blockproc(A, [m n], @(x) mean(x.data(:)));
Example:
>> A = magic(6)
A =
35 1 6 26 19 24
3 32 7 21 23 25
31 9 2 22 27 20
8 28 33 17 10 15
30 5 34 12 14 16
4 36 29 13 18 11
>> B = blockproc(A, [m n], @(x) mean(x.data(:)))
B =
17.7500 15.0000 22.7500
19.0000 18.5000 18.0000
18.7500 22.0000 14.7500
Method 2
If you prefer the reshaping way (which is probably faster), use this great answer to organize the matrix into 2x2 blocks tiled along the third dimension, average along the first two dimensions, and reshape the result:
T = permute(reshape(permute(reshape(A, size(A, 1), n, []), [2 1 3]), n, m, []), [2 1 3]);
B = reshape(mean(mean(T,1),2), size(A,1)/m, size(A,2)/n);
Method 3
Apply a 2D convolution (conv2) and then downsample. The convolution computes more entries than are really necessary (hence the downsampling), but on the other hand it can be done separably, which helps speed things up:
B = conv2(ones(m,1)/m, ones(1,n)/n ,A,'same');
B = B(m-1:m:end ,n-1:n:end);
One approach based on this solution using reshape, sum & squeeze -
sublen = 2; %// subset length
part1 = reshape(sum(reshape(A,sublen,[])),size(A,1)/sublen,sublen,[]);
out = squeeze(sum(part1,2))/sublen^2;
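As a rough check, the same steps can be run on the 6-by-6 example matrix used in the blockproc answer. The sketch below assumes square sublen-by-sublen blocks and that both dimensions of A are divisible by sublen; the matrix is typed in literally so the example does not depend on magic.

```matlab
%// 6-by-6 example matrix (the magic(6) values shown earlier)
A = [35  1  6 26 19 24;
      3 32  7 21 23 25;
     31  9  2 22 27 20;
      8 28 33 17 10 15;
     30  5 34 12 14 16;
      4 36 29 13 18 11];
sublen = 2; %// subset length
%// Sum each vertical pair of rows, then arrange as
%// (row block) x (column within block) x (column block)
part1 = reshape(sum(reshape(A,sublen,[])),size(A,1)/sublen,sublen,[]);
%// Sum across the columns within each block and divide by the block area
out = squeeze(sum(part1,2))/sublen^2;
disp(out)
```

The result should agree with the blockproc output listed for the same matrix.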
Benchmarking
Set #1
Here are the runtime comparisons for the approaches listed so far, for an input size of 800 x 800 -
%// Input
A = rand(800,800);
%// Warm up tic/toc.
for k = 1:50000
tic(); elapsed = toc();
end
disp('----------------------- With RESHAPE + SUM + SQUEEZE')
tic
sublen = 2; %// subset length
part1 = reshape(sum(reshape(A,sublen,[])),size(A,1)/sublen,sublen,[]);
out = squeeze(sum(part1,2))/sublen^2;
toc, clear sublen part1 out
disp('----------------------- With BLOCKPROC')
tic
B = blockproc(A, [2 2], @(x) mean(x.data(:))); %// [m n]
toc, clear B
disp('----------------------- With PERMUTE + MEAN + RESHAPE')
tic
m = 2;n = 2;
T = permute(reshape(permute(reshape(A, size(A, 1), n, []),...
[2 1 3]), n, m, []), [2 1 3]);
B = reshape(mean(mean(T,1),2), size(A,1)/m, size(A,2)/n);
toc, clear B T m n
disp('----------------------- With CONVOLUTION')
tic
m = 2;n = 2;
B = conv2(ones(m,1)/m, ones(1,n)/n ,A,'same');
B = B(m-1:m:end ,n-1:n:end);
toc, clear m n B
disp('----------------------- With MAT2CELL')
tic
AC = mat2cell(A, repmat(2,size(A,1)/2,1), repmat(2,size(A,2)/2,1));
out = cellfun(@(x) mean(x(:)), AC);
toc
disp('----------------------- With IM2COL')
tic
out = reshape(mean(im2col(A,[2 2],'distinct')),size(A)./2);
toc
Runtime results -
----------------------- With RESHAPE + SUM + SQUEEZE
Elapsed time is 0.004702 seconds.
----------------------- With BLOCKPROC
Elapsed time is 6.039851 seconds.
----------------------- With PERMUTE + MEAN + RESHAPE
Elapsed time is 0.006015 seconds.
----------------------- With CONVOLUTION
Elapsed time is 0.002174 seconds.
----------------------- With MAT2CELL
Elapsed time is 2.362291 seconds.
----------------------- With IM2COL
Elapsed time is 0.239218 seconds.
For a fairer comparison, we can time the three fastest approaches over 1000 trials each, for the same input size of 800 x 800, giving us -
----------------------- With RESHAPE + SUM + SQUEEZE
Elapsed time is 1.264722 seconds.
----------------------- With PERMUTE + MEAN + RESHAPE
Elapsed time is 3.986038 seconds.
----------------------- With CONVOLUTION
Elapsed time is 1.992030 seconds.
Set #2
Here are the runtime comparisons for a larger input size of 10000 x 10000, for the three fastest approaches -
----------------------- With RESHAPE + SUM + SQUEEZE
Elapsed time is 0.158483 seconds.
----------------------- With PERMUTE + MEAN + RESHAPE
Elapsed time is 0.589322 seconds.
----------------------- With CONVOLUTION
Elapsed time is 0.307836 seconds.
I'm working on Matlab and was wondering how I add terms within a large matrix. Specifically, I have a 4914x4914 matrix and would like to create a 189x189 matrix, where each term is equal to the sum of the terms in each 26x26 subset.
To illustrate, say I had the magic 4x4 matrix as follows:
[16 2 3 13;
5 11 10 8;
9 7 6 12;
4 14 15 1]
and I wanted to create a 2x2 matrix equal to the sum of each 2x2 matrix within the original magic 4x4, i.e.:
[(16+2+5+11) (3+13+10+8);
(9+7+4+14) (6+12+15+1)]
Grateful for any advice!
Thanks
jake
Assuming A to be the input 4914x4914 matrix, this could be an efficient (in terms of runtime) approach -
sublen = 26; %// subset length
squeeze(sum(reshape(sum(reshape(A,sublen,[])),size(A,1)/sublen,sublen,[]),2))
For a generic block size, let's have a function -
function out = sum_blocks(A,block_nrows, block_ncols)
out = squeeze(sum(reshape(sum(reshape(A,block_nrows,[])),...
size(A,1)/block_nrows,block_ncols,[]),2));
return
Sample run -
>> A = randi(9,4,6);
>> A
A =
8 2 4 9 4 5
3 3 8 3 6 8
9 6 6 7 1 9
4 5 5 7 1 2
>> sum_blocks(A,2,3)
ans =
28 35
35 27
>> sum(sum(A(1:2,1:3)))
ans =
28
>> sum(sum(A(1:2,4:6)))
ans =
35
>> sum(sum(A(3:4,1:3)))
ans =
35
>> sum(sum(A(3:4,4:6)))
ans =
27
If you would like to avoid squeeze -
sum(permute(reshape(sum(reshape(A,sublen,[])),size(A,1)/sublen,sublen,[]),[1 3 2]),3)
Benchmarking
Hoping you would care about performance, here are the benchmark results for all the solutions posted here. The benchmarking code that I have used -
num_runs = 100; %// Number of iterations to run benchmarks
A = rand(4914);
for k = 1:50000
tic(); elapsed = toc(); %// Warm up tic/toc
end
disp('---------------------- With squeeze + reshape + sum')
tic
for iter = 1:num_runs
sublen = 26; %// subset length
out1 = squeeze(sum(reshape(sum(reshape(A,sublen,[])),...
size(A,1)/sublen,sublen,[]),2));
end
time1 = toc;
disp(['Avg. elapsed time = ' num2str(time1/num_runs) ' sec(s)']), clear out1 sublen
disp('---------------------- With kron + matrix multiplication')
tic
for iter = 1:num_runs
n = 26; k = 189;
B = kron(speye(k), ones(1,n));
result = B*A*B';
end
time2 = toc;
disp(['Avg. elapsed time = ' num2str(time2/num_runs) ' sec(s)']),clear result n k B
disp('---------------------- With accumarray')
tic
for iter = 1:num_runs
s = 26; n = size(A,1)/s;
subs = kron(reshape(1:(n^2), n, n),ones(s));
out2 = reshape(accumarray(subs(:), A(:)), n, n);
end
time2 = toc;
disp(['Avg. elapsed time = ' num2str(time2/num_runs) ' sec(s)']),clear s n subs out2
The benchmark results I got on my system -
---------------------- With squeeze + reshape + sum
Avg. elapsed time = 0.050729 sec(s)
---------------------- With kron + matrix multiplication
Avg. elapsed time = 0.068293 sec(s)
---------------------- With accumarray
Avg. elapsed time = 0.64745 sec(s)
An alternative way is to reshape the whole matrix into a 4D matrix and sum the elements over first and third dimension:
result = squeeze(sum(sum(reshape(A,26,189,26,189),1),3));
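The hard-coded 26 and 189 can be derived from the block size. Here is a small sketch on a 4-by-4 matrix, assuming square s-by-s blocks that tile A exactly; the names s, nr and nc are introduced here for illustration.

```matlab
%// Small test matrix: columns are 1:4, 5:8, 9:12, 13:16
A = reshape(1:16,4,4);
s = 2;               %// block size
nr = size(A,1)/s;    %// number of row blocks
nc = size(A,2)/s;    %// number of column blocks
%// Dimensions: (row in block, row block, column in block, column block)
%// Summing over dimensions 1 and 3 adds up each block
result = squeeze(sum(sum(reshape(A,s,nr,s,nc),1),3));
disp(result)
```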
If you don't have the image processing toolbox then you can do this using accumarray:
s = 26;
n = size(A,1)/s;
subs = kron(reshape(1:(n^2), n, n),ones(s));
reshape(accumarray(subs(:), A(:)), n, n)
This is reusable should you decide to aggregate in some way other than a simple sum, e.g. a median:
reshape(accumarray(subs(:), A(:), [], @median), n, n)
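To make the role of subs concrete, here is a small sketch on a 4-by-4 matrix with 2-by-2 blocks. It uses @max as the alternative aggregation purely as an illustration; the names block_sum and block_max are introduced here.

```matlab
%// Small test matrix: columns are 1:4, 5:8, 9:12, 13:16
A = reshape(1:16,4,4);
s = 2;
n = size(A,1)/s;
%// subs labels every element of A with the index of the block it belongs to:
%// [1 1 3 3; 1 1 3 3; 2 2 4 4; 2 2 4 4]
subs = kron(reshape(1:(n^2), n, n),ones(s));
%// Default aggregation is a sum per label
block_sum = reshape(accumarray(subs(:), A(:)), n, n);
%// Any function that reduces a vector to a scalar can be swapped in
block_max = reshape(accumarray(subs(:), A(:), [], @max), n, n);
disp(block_sum)
disp(block_max)
```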
You can use matrix multiplication, of course:
n = 26;
k = 189;
B = kron(speye(k), ones(1,n));
result = B*A*B';
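On a small matrix the structure of B is easy to inspect. A sketch with 2-by-2 blocks on a 4-by-4 input, where n is the block size and k the number of blocks per dimension; full is applied only so the result prints and compares as an ordinary matrix.

```matlab
%// Small test matrix: columns are 1:4, 5:8, 9:12, 13:16
A = reshape(1:16,4,4);
n = 2;   %// block size
k = 2;   %// number of blocks per dimension
%// B = [1 1 0 0; 0 0 1 1]: each row of B selects the rows of one block
B = kron(speye(k), ones(1,n));
%// B*A sums rows within each block, then *B' sums columns within each block
result = full(B*A*B');
disp(result)
```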
I have a matrix A, and I want to remove the rows whose two values are equal, such as (1,1), (2,2), (3,3)
A =
1 1
2 1
3 1
1 2
2 2
1 3
3 3
so the matrix would be like this
2 1
3 1
1 2
1 3
Another approach without calling any function:
A = A(A(:,1) ~= A(:,2),:)
Efficiency of this approach vs the solution based on diff():
n = 10;
y = [round(rand(n,1)) round(rand(n,1))];
tic;
for i = 1:1e4
A = y;
A(diff(A,[],2)~=0,:);
end
toc
Elapsed time is 0.091990 seconds.
tic;
for i = 1:1e4
A = y;
A = A(A(:,1) ~= A(:,2),:);
end
toc
Elapsed time is 0.037842 seconds.
% Suggestion of @Dan in the comments
tic;
for i = 1:1e4
A = y;
A(A(:,1) == A(:,2),:) = [];
end
toc
Elapsed time is 0.147636 seconds.
One approach using diff -
A(diff(A,[],2)~=0,:)
For a general N-by-M case, where M is the number of columns of A, one can extend this as -
A(any(diff(A,[],2)~=0,2),:)
Thus, if you have
A= [1 1 1;
2 2 3;
3 1 4;
8 1 2;
2 2 2;
1 3 1;
3 3 3]
You would get -
2 2 3
3 1 4
8 1 2
1 3 1
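For the two-column matrix from the question, the diff-based mask and a direct column comparison should select the same rows. A small sketch; the names out_diff and out_cmp are introduced here for illustration.

```matlab
%// Matrix from the question
A = [1 1; 2 1; 3 1; 1 2; 2 2; 1 3; 3 3];
%// Keep rows whose column difference is nonzero
out_diff = A(diff(A,[],2)~=0,:);
%// Equivalent for two columns: keep rows whose entries differ
out_cmp = A(A(:,1) ~= A(:,2),:);
disp(out_diff)
disp(isequal(out_diff,out_cmp))
```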
I have a sparse matrix of the following form.
Let's take an example of a 5x10 matrix
1 2 3 4 5 6 7 8 9 10
1 1 1 0 0 0 0 0 0 0 0
2 0 1 0 0 0 0 0 0 0 0
3 .............................
4 .............................
5 .............................
From this sparse matrix, I want to create a cell array C of form
C{1} = 1
C{2} = [1,2]
...........
...........
...........
My sparse matrix is high-dimensional, around 40000 by 790000. How can I do this efficiently in MATLAB? I could certainly use a loop, but that would be inefficient, and I want the most efficient approach. Suggestions?
Use find to get the indices and accumarray to group them by columns:
[ii, jj] = find(A);
C = accumarray(jj, ii, [], @(v) {v.'});
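Here is a small sketch on the 5-by-10 example from the question, with nonzeros at (1,1), (1,2) and (2,2). One assumption beyond the original answer: the output size is passed explicitly as [size(A,2) 1], so that trailing empty columns still get a (empty) cell; with [] the result would stop at the last nonempty column.

```matlab
%// 5-by-10 sparse matrix with nonzeros at (1,1), (1,2) and (2,2)
A = sparse([1 1 2],[1 2 2],1,5,10);
%// Row and column indices of the nonzeros, ordered column by column
[ii, jj] = find(A);
%// Group the row indices by column; each group is wrapped in a cell
C = accumarray(jj, ii, [size(A,2) 1], @(v) {v.'});
disp(C{1})
disp(C{2})
```

Because find returns the nonzeros in column-major order, jj is already sorted, so the row indices inside each cell keep their ascending order.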
Benchmarking
%// Random sparse matrix. Code adapted from @teng's answer
sz = [4e4 79e4];
nz = 1e5; %// number of nonzeros
A = sparse(randi(sz(1),[nz 1]),randi(sz(2),[nz 1]),1,sz(1),sz(2));
tic;
[ii, jj] = find(A);
C = accumarray(jj, ii, [], @(v) {v.'});
toc
Results:
For nz = 1e4:
Elapsed time is 0.099657 seconds.
For nz = 1e5:
Elapsed time is 0.756234 seconds.
For nz = 1e6:
Elapsed time is 5.431427 seconds.
Let me get the party started...
let's start with the basics:
tic;
sz = [ 400 7900]; % hehe...
aMat = sparse(randi(sz(1),[1000 1]),randi(sz(2),[1000 1]),1,sz(1),sz(2));
aCell = mat2cell(aMat,ones([sz(1) 1]));
preC = cellfun(@(x) x(x~=0), aCell,'UniformOutput',false);
C = cellfun(@(x) find(x), preC,'UniformOutput',false);
toc