Efficient comparison of two matrices MATLAB - matlab

Is there an efficient way to compare two matrices? I was thinking of something like
same = abs((A-B)) = 0...
that is, subtracting the values of one matrix from the other and, if the result is 0, treating them as the same. There is also an isequal() function. Which would be the best way to compare the two matrices?

You can simply do isequal(A,B) and it will return 1 if true or 0 if false.

Since you're dealing with floating point, you probably don't want to test for exact equality (depending on your application). Thus, you can just check that
norm(A - B)
is sufficiently small, say < 1e-9, again depending on your application. This is the matrix 2-norm, which will be near zero if A - B is the all zeros matrix or nearly so.
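As a minimal sketch of the difference between the exact and the tolerance-based checks (the matrices and the tolerance tol below are made-up illustration values, not taken from the question):

```matlab
% Hypothetical example data: B differs from A only by rounding-level noise.
A = [1 2; 3 4];
B = A + 1e-12 * [1 -1; -1 1];

tol = 1e-9;                        % assumed tolerance; pick one suited to your data

exactlyEqual = isequal(A, B);      % false: the entries differ in the last digits
closeNorm    = norm(A - B) < tol;  % true: the 2-norm of the difference is tiny
closeElem    = all(abs(A(:) - B(:)) < tol);  % true: element-wise variant, also usable for N-D arrays
```

The element-wise variant avoids computing a singular value for the 2-norm, which may matter for large matrices; which of the two is appropriate depends on the application.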

It seems that ISEQUAL is faster than checking for non-zero elements after subtraction:
>> a = rand(100, 100);
>> b = a;
>> tic; for ii = 1:100000; any(any(a - b)); end; toc;
Elapsed time is 2.089838 seconds.
>> tic; for ii = 1:100000; isequal(a, b); end; toc;
Elapsed time is 1.201815 seconds.

Related

Relational operators between big sparse matrices in Matlab

I have two big sparse double matrices in Matlab:
P with dimension 1048576 x 524288
I with dimension 1048576 x 524288
I want to find the number of entries (i,j) such that P(i,j)<=I(i,j).
Naively I tried to run
n=sum(sum(P<=I));
but it is extremely slow (I had to shut down Matlab because it ran for a very long time and I wasn't able to interrupt it).
Is there a more efficient way to proceed, or is what I want to do infeasible?
From some simple tests,
n = numel(P) - nnz(P>I);
seems to be faster than sum(sum(P<=I)) or even nnz(P<=I). The reason is probably that the sparse matrix P<=I has many more nonzero entries than P>I, and thus requires more memory.
Example:
>> P = sprand(10485, 52420, 1e-3);
>> I = sprand(10485, 52420, 1e-3);
>> tic, disp(sum(sum(P<=I))); toc
(1,1) 549074582
Elapsed time is 3.529121 seconds.
>> tic, disp(nnz(P<=I)); toc
549074582
Elapsed time is 3.538129 seconds.
>> tic, disp(nnz(P<=I)); toc
549074582
Elapsed time is 3.499927 seconds.
>> tic, disp(numel(P) - nnz(P>I)); toc
549074582
Elapsed time is 0.010624 seconds.
Of course this highly depends on the matrix sizes and density.
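A small sanity check of the complement trick, assuming much smaller matrices than in the question (the sizes and density here are arbitrary illustration values):

```matlab
% Small hypothetical sparse matrices (sizes and density chosen only for illustration).
P = sprand(200, 300, 0.01);
I = sprand(200, 300, 0.01);

nDirect     = nnz(P <= I);            % builds a mostly-nonzero sparse logical matrix
nComplement = numel(P) - nnz(P > I);  % builds a very sparse logical matrix instead

assert(nDirect == nComplement);
```

Both expressions count the same entries; the second one only ever stores the few positions where P exceeds I, which is presumably why it is so much faster on sparse inputs.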
Here is a solution using indices of nonzero elements:
xp = find(P);
xi = find(I);
vp = nonzeros(P);
vi = nonzeros(I);
[s,ia,ib] = intersect(xp,xi);
iia = true(numel(vp),1);
iia(ia)=false;
iib = true(numel(vi),1);
iib(ib) = false;
n = sum(vp(ia) <= vi(ib))+sum(vp(iia)<0)+sum(vi(iib)>0)-(numel(xp)+numel(xi)-numel(s))+numel(P);

Calculate a function over all permutations of columns

I have this code:
abs(mean(exp(1i*( a(:,1) - a(:,2) ))))
where a is a 550-by-129 double matrix. How can I extend this so that the computation runs over every pair of columns, i.e. a(:,1) paired with a(:,2), then with a(:,3), and so on? I need each column subtracted from every other column.
Another method using matrix multiplication:
E = exp(1i*a);
result = abs(E.'*(1./E)/size(E,1));
Explanation:
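You can rewrite the expression
exp(1i*(a - b))
as
exp(1i*a)./exp(1i*b)
so
exp(1i*a).*(1./exp(1i*b))
and mean(x) is sum(x)/n.
Summing these element-wise products over the rows for every pair of columns is exactly what the matrix product E.'*(1./E) computes, so the task reduces to a very fast matrix multiplication.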
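To check that the matrix-multiplication form agrees with the direct definition, here is a small sketch on a made-up 50-by-6 input (the question uses 550-by-129; the variable names viaProduct, viaLoop and maxDiff are only for illustration):

```matlab
a = rand(50, 6);   % hypothetical small input

% Matrix-multiplication form
E = exp(1i * a);
viaProduct = abs(E.' * (1 ./ E) / size(E, 1));

% Reference: explicit double loop over all column pairs
n = size(a, 2);
viaLoop = zeros(n);
for ii = 1:n
    for jj = 1:n
        viaLoop(ii, jj) = abs(mean(exp(1i * (a(:, ii) - a(:, jj)))));
    end
end

maxDiff = max(abs(viaProduct(:) - viaLoop(:)));   % should be at rounding level
```

Note the use of .' (plain transpose) rather than ' (conjugate transpose): with complex E the two give different results.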
Result of a comparison between different methods in Octave:
Matrix Multiplication:
Elapsed time is 0.0133181 seconds.
BSXFUN:
Elapsed time is 1.33882 seconds.
REPMAT:
Elapsed time is 1.43535 seconds.
FOR LOOP:
Elapsed time is 3.10798 seconds.
Here is the code for comparing different methods.
Looped, this is an easy trick; let an outer loop run over all indices, and an inner loop as well.
a = rand(550,129);
out = zeros(size(a,2),size(a,2));
for ii = 1:size(a,2)
for jj = 1:size(a,2)
out(ii,jj) = abs(mean(exp(1i*(a(:,ii)-a(:,jj)))));
end
end
No loops, one line:
result = permute(abs(mean(exp(1i*bsxfun(@minus, a, permute(a, [1 3 2]))),1)), [2 3 1]);
This computes all pairwise column differences as a 3D array, where the second and third dimensions refer to the two column indices in the original 2D array; then applies the required operations along the first dimension; and finally permutes the dimensions to yield a 2D array result.
A bit off-topic, but you can do that with indexing too (note that imresize requires the Image Processing Toolbox):
a = rand(550,129);
c = repmat(1:size(a,2),1,size(a,2));
c(2,:) = imresize(1:size(a,2), [1 length(c)], 'nearest');
out = abs(mean(exp(1i*( a(:,c(1,:)) - a(:,c(2,:)) ))));
out = reshape(out,[size(a,2) size(a,2)]); % 129x129 format

MATLAB: Block matrix multiplying without loops

I have a block matrix [A B C...] and a matrix D (all 2-dimensional). D has dimensions y-by-y, and A, B, C, etc are each z-by-y. Basically, what I want to compute is the matrix [D*(A'); D*(B'); D*(C');...], where X' refers to the transpose of X. However, I want to accomplish this without loops for speed considerations.
I have been playing with the reshape command for several hours now, and I know how to use it in other cases, but this use case is different from the other ones and I cannot figure it out. I also would like to avoid using multi-dimensional matrices if at all possible.
Honestly, a loop is probably the best way to do it. In my image-processing work I found a well-written loop that takes advantage of Matlab's JIT compiler is often faster than all the extra overhead of manipulating the data to be able to use a vectorised operation. A loop like this:
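[m, n] = size(A);   % assumes square m-by-m blocks (z == y), so D is m-by-m
T = zeros(m, n);    % results are concatenated along the columns
AT = A';
for ii = 1:m:n
    T(:, ii:ii+m-1) = D * AT(ii:ii+m-1, :);   % D times the transpose of one block
end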
contains only built-in operators and the bare minimum of copying, and given the JIT is going to be hard to beat. Even if you want to factor in interpreter overhead it's still only a single statement with no functions to consider.
The "loop-free" version with extra faffing around and memory copying, is to split the matrix and iterate over the blocks with a hidden loop:
blksize = size(D, 1);
blkcnt = size(A, 2) / blksize;
blocks = mat2cell(A, blksize, repmat(blksize,1,blkcnt));
blocks = cellfun(@(x) D*x', blocks, 'UniformOutput', false);
T = cell2mat(blocks);
Of course, if you have access to the Image Processing Toolbox, you can also cheat horribly:
T = blockproc(A, size(D), @(x) D*x.data');
Prospective approach & Solution Code
Given:
M is the block matrix [A B C...], where each A, B, C etc. are of size z x y. Let the number of such matrices be num_mat for easy reference later on.
If those matrices are concatenated along the columns, then M would be of size z x num_mat*y.
D is the matrix to be multiplied with each of those matrices A, B, C etc. and is of size y x y.
Now, as stated in the problem, the output you are after is [D*(A'); D*(B'); D*(C');...], i.e. the multiplication results being concatenated along the rows.
If you are okay with those multiplication results being concatenated along the columns instead, i.e. [D*(A') D*(B') D*(C') ...], you can achieve this with some reshaping followed by a single matrix multiplication of the entire M with D, which gives a vectorized no-loop approach. To get such a matrix multiplication result, you can do -
mults = D*reshape(permute(reshape(M,z,y,[]),[2 1 3]),y,[]);
But, if you HAVE to get an output with the multiplication results being concatenated along the rows, you need to do some more reshaping like so -
out = reshape(permute(reshape(mults,y,z,[]),[1 3 2]),[],z);
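The two reshape/permute expressions can be checked against an explicit loop on small inputs. The sketch below assumes made-up sizes with z different from y, so that any mix-up of dimensions would show; refCols, refRows, errCols and errRows are illustration names only:

```matlab
z = 3; y = 2; num_mat = 4;          % hypothetical small sizes
M = rand(z, num_mat * y);           % M = [A B C ...], each block z-by-y
D = rand(y, y);

% Vectorized results, as in the expressions above
mults = D * reshape(permute(reshape(M, z, y, []), [2 1 3]), y, []);
out   = reshape(permute(reshape(mults, y, z, []), [1 3 2]), [], z);

% Reference results from an explicit loop over the blocks
refCols = zeros(y, z * num_mat);    % [D*A' D*B' ...]
refRows = zeros(y * num_mat, z);    % [D*A'; D*B'; ...]
for k = 1:num_mat
    blk = M(:, (k-1)*y + (1:y));                 % k-th z-by-y block
    refCols(:, (k-1)*z + (1:z)) = D * blk.';
    refRows((k-1)*y + (1:y), :) = D * blk.';
end

errCols = max(abs(mults(:) - refCols(:)));
errRows = max(abs(out(:) - refRows(:)));
```

Both errors should be at rounding level, which suggests the expressions also hold when the blocks are not square.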
Benchmarking
This section covers benchmarking codes comparing the proposed vectorized approach against a naive JIT powered loopy approach to get the desired output. As discussed earlier, depending on how the output array must hold the multiplication results, you can have two cases.
Case I: Multiplication results concatenated along the columns
%// Define size parameters and then define random inputs with those
z = 500; y = 500; num_mat = 500;
M = rand(z,num_mat*y);
D = rand(y,y);
%// Warm up tic/toc.
for k = 1:100000
tic(); elapsed = toc();
end
disp('---------------------------- With loopy approach')
tic
out1 = zeros(z,y*num_mat);
for k1 = 1:y:y*num_mat
out1(:,k1:k1+y-1) = D*M(:,k1:k1+y-1).'; %//'
end
toc, clear out1 k1
disp('---------------------------- With proposed approach')
tic
mults = D*reshape(permute(reshape(M,z,y,[]),[2 1 3]),y,[]);
toc
Case II: Multiplication results concatenated along the rows
%// Define size parameters and then define random inputs with those
z = 500; y = 500; num_mat = 500;
M = rand(z,num_mat*y);
D = rand(y,y);
%// Warm up tic/toc.
for k = 1:100000
tic(); elapsed = toc();
end
disp('---------------------------- With loopy approach')
tic
out1 = zeros(y*num_mat,z);
for k1 = 1:y:y*num_mat
out1(k1:k1+y-1,:) = D*M(:,k1:k1+y-1).'; %//'
end
toc, clear out1 k1
disp('---------------------------- With proposed approach')
tic
mults = D*reshape(permute(reshape(M,z,y,[]),[2 1 3]),y,[]);
out2 = reshape(permute(reshape(mults,y,z,[]),[1 3 2]),[],z);
toc
Runtimes
Case I:
---------------------------- With loopy approach
Elapsed time is 3.889852 seconds.
---------------------------- With proposed approach
Elapsed time is 3.051376 seconds.
Case II:
---------------------------- With loopy approach
Elapsed time is 3.798058 seconds.
---------------------------- With proposed approach
Elapsed time is 3.292559 seconds.
Conclusions
The runtimes suggest a modest speedup with the proposed vectorized approach: roughly 22% less time in Case I and about 13% less in Case II. So, hopefully this works out for you!
If you want to get A, B, and C from a bigger matrix you can do this, assuming the bigger matrix is called X:
A = X(:,1:y)
B = X(:,y+1:2*y)
C = X(:,2*y+1:3*y)
If there are N such matrices, each with x rows and y columns, the best way is to use reshape like:
F = reshape(X, x,y,N)
Then use a loop to generate a new matrix I call it F1 as:
F1=[];
for n=1:N
F1 = [F1 F(:,:,n)'];
end
Then compute F2 as:
F2 = D*F1;
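and finally rearrange the N blocks of F2, each y-by-x, so that they are stacked along the rows:
R = reshape(permute(reshape(F2, y, x, N), [1 3 2]), N*y, x)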
Note: this for loop does not slow you down as it is just to reformat the matrix and the multiplication is done in matrix form.

MATLAB/Octave - use cellfun to index one matrix with another

I have a cell containing a random number of matrices, say a = {[300*20],....,[300*20]};. I have another cell of the same format, call it b, that contains the logicals of the position of the nan terms in a.
I want to use cellfun to loop through the cell and basically let the nan terms equal to 0 i.e. a(b)=0.
Thanks,
j
You could define a function that replaces any NaN with zero.
function a = nan2zero(a)
a(isnan(a)) = 0;
Then you can use cellfun to apply this function to your cell array.
a0 = cellfun(@nan2zero, a, 'UniformOutput', 0)
That way, you don't even need any matrices b.
First, you should probably give the tick to @s.bandara, as that was the first correct answer and it used cellfun (as you requested). Do NOT give it to this answer. The purpose of this answer is to provide some additional analysis.
I thought I'd look into the efficiency of some of the possible approaches to this problem.
The first approach is the one advocated by @s.bandara.
The second approach is similar to the one advocated by @s.bandara, but it uses b to convert nan to 0, rather than using isnan. In theory, this method may be faster, since nothing is assigned to b inside the function, so it should be treated "By Ref".
The third approach uses a loop to get around using cellfun, since cellfun is often slower than an explicit loop.
The results of a quick speed test are:
Elapsed time is 3.882972 seconds. %# First approach (a, isnan, and cellfun, e.g. @s.bandara)
Elapsed time is 3.391190 seconds. %# Second approach (a, b, and cellfun)
Elapsed time is 3.041992 seconds. %# Third approach (loop-based solution)
In other words, there are (small) savings to be made by passing b in rather than using isnan. And there are further (small) savings to be made by using a loop rather than cellfun. But I wouldn't lose sleep over it. Remember, the results of any simulation are specific to the specified inputs.
Note, these results were consistent across several runs, I used tic and toc to do this, albeit with many loops over each method. If I wanted to be really thorough, I should use timeit from FEX. If anyone is interested, the code for the three methods follows:
%# Build some example matrices
T = 1000; N = 100; Q = 50; M = 100;
a = cell(1, Q); b = cell(1, Q);
for q = 1:Q
a{q} = randn(T, N);
b{q} = logical(randi(2, T, N) - 1);
a{q}(b{q}) = nan;
end
%# Solution using a, isnan, and cellfun (@s.bandara solution)
tic
for m = 1:M
Soln2 = cellfun(@f1, a, 'UniformOutput', 0);
end
toc
%# Solution using a, b, and cellfun
tic
for m = 1:M
Soln1 = cellfun(@f2, a, b, 'UniformOutput', 0);
end
toc
%# Solution using a loop to avoid cellfun
tic
for m = 1:M
Soln3 = cell(1, Q);
for q = 1:Q
Soln3{q} = a{q};
Soln3{q}(b{q}) = 0;
end
end
toc
%# Solution proposed by @EitanT
[K, N] = size(a{1});
tic
for m = 1:M
a0 = [a{:}]; %// Concatenate matrices along the 2nd dimension
a0(isnan(a0)) = 0; %// Replace NaNs with zeroes
Soln4 = mat2cell(a0, K, N * ones(size(a)));
end
toc
where:
function x1 = f1(x1)
x1(isnan(x1)) = 0;
and:
function x1 = f2(x1, x2)
x1(x2) = 0;
UPDATE: A fourth approach has been suggested by @EitanT. This approach concatenates the cell array of matrices into one large matrix, performs the operation on the large matrix, then optionally converts it back to a cell array. I have added the code for this procedure to my testing routine above. For the inputs specified in my testing code, i.e. T = 1000, N = 100, Q = 50, and M = 100, the timed run is as follows:
Elapsed time is 3.916690 seconds. %# @s.bandara
Elapsed time is 3.362319 seconds. %# a, b, and cellfun
Elapsed time is 2.906029 seconds. %# loop-based solution
Elapsed time is 4.986837 seconds. %# @EitanT
I was somewhat surprised by this, as I thought the approach of @EitanT would yield the best results. On paper, it seems extremely sensible. Note, we can of course mess around with the input parameters to find specific settings that advantage different solutions. For example, if the matrices are small but the number of them is large, then the approach of @EitanT does well, e.g. T = 10, N = 5, Q = 500, and M = 100 yields:
Elapsed time is 0.362377 seconds. %# @s.bandara
Elapsed time is 0.299595 seconds. %# a, b, and cellfun
Elapsed time is 0.352112 seconds. %# loop-based solution
Elapsed time is 0.030150 seconds. %# @EitanT
Here the approach of @EitanT dominates.
For the scale of the problem indicated by the OP, I found that the loop-based solution usually had the best performance. However, for some Q, e.g. Q = 5, the solution of @EitanT managed to edge ahead.
Hmm.
Given the nature of the contents of your cell array, there may exist an even faster solution: you can convert your cell data to a single matrix and use vector indexing to replace all NaN values in it at once, without the need of cellfun or loops:
a0 = [a{:}]; %// Concatenate matrices along the 2nd dimension
a0(isnan(a0)) = 0; %// Replace NaNs with zeroes
If you want to convert it back to a cell array, that's fine:
[M, N] = size(a{1});
mat2cell(a0, M, N * ones(size(a)))
P.S.
Work with a 3-D matrix instead of a cell array, if possible. Vectorized operations are usually much faster in MATLAB.
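A minimal sketch of the 3-D alternative, assuming all matrices in the cell array have the same size (the tiny cell array below is made up for illustration):

```matlab
% Hypothetical cell array of equally sized matrices containing some NaNs.
a = {[1 NaN; 3 4], [NaN 6; 7 NaN], [9 10; 11 12]};

a3 = cat(3, a{:});       % stack into a 2-by-2-by-3 array
a3(isnan(a3)) = 0;       % one logical-indexing assignment handles every slice

secondSlice = a3(:, :, 2);   % corresponds to a{2} with its NaNs set to zero
```

Here a3(:,:,k) plays the role of a{k}, so later operations can index slices directly instead of going through cellfun.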

Efficient way in MATLAB to apply the same left and right matrix multiplication to a large set of matrices

I have a lot of 2-by-2 matrices S1, S2, ..., SN, and on each of those matrices, I want to perform a left and right matrix multiplication as in R*S*R^T, where R is also a 2-by-2 matrix. Obviously I could just write this with a for loop, but I anticipate it being very slow for large N in MATLAB. Is there a simple and efficient way to accomplish this without using a for loop? Thanks in Advance!
Your biggest problem is not the loops. For matrices this small, calling MATLAB's A*B introduces a lot of overhead. The best thing you can do is to store all the matrices in a large 4 x n_matrices matrix and spell out the matrix multiplications manually:
A = rand(4, 1000);
B = rand(4, 1000);
tic;
C = zeros(size(A));
C(1,:) = A(1,:).*B(1,:) + A(3,:).*B(2,:);
C(2,:) = A(2,:).*B(1,:) + A(4,:).*B(2,:);
C(3,:) = A(1,:).*B(3,:) + A(3,:).*B(4,:);
C(4,:) = A(2,:).*B(3,:) + A(4,:).*B(4,:);
toc
Elapsed time is 0.020950 seconds.
As you see, this takes little time (this is a 6-year-old desktop PC). For small matrices like this it is practical, and I cannot imagine anything else written in MATLAB that could beat this performance-wise. For a very large number of 2x2 matrices you could introduce blocking (i.e., handle only a number of matrices at a time) to enhance cache reuse.
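The same spelled-out idea can be applied to the R*S*R^T product from the question. The following is only a sketch under the assumption that each 2-by-2 matrix S_k is stored column-wise as one column of a 4-by-N array, i.e. as [s11; s21; s12; s22]; the names T, U, ref and maxErr are illustrative:

```matlab
N = 5;                       % hypothetical number of matrices
R = rand(2);
S = rand(4, N);              % column k holds S_k(:), i.e. [s11; s21; s12; s22]

% T_k = R * S_k for every k, written out element by element
T = zeros(4, N);
T(1,:) = R(1,1).*S(1,:) + R(1,2).*S(2,:);
T(2,:) = R(2,1).*S(1,:) + R(2,2).*S(2,:);
T(3,:) = R(1,1).*S(3,:) + R(1,2).*S(4,:);
T(4,:) = R(2,1).*S(3,:) + R(2,2).*S(4,:);

% U_k = T_k * R.' for every k
U = zeros(4, N);
U(1,:) = T(1,:).*R(1,1) + T(3,:).*R(1,2);
U(2,:) = T(2,:).*R(1,1) + T(4,:).*R(1,2);
U(3,:) = T(1,:).*R(2,1) + T(3,:).*R(2,2);
U(4,:) = T(2,:).*R(2,1) + T(4,:).*R(2,2);

% Reference: plain loop over the 2-by-2 matrices
ref = zeros(4, N);
for k = 1:N
    Sk = reshape(S(:,k), 2, 2);
    Uk = R * Sk * R.';
    ref(:,k) = Uk(:);
end
maxErr = max(abs(U(:) - ref(:)));   % should be at rounding level
```

reshape(U(:,k), 2, 2) recovers the k-th result as an ordinary 2-by-2 matrix when needed.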
I would say that the loop here is not that bad and not that slow. Consider this:
N = 1000000
S = cell(1,N);
Out = S;
A = rand(2);
B = rand(2);
for i = 1 : N
S{i} = rand(2);
end
tic
for i = 1 : N
Out{i} = A * S{i} * B;
end
toc
tic
f = @(i) A*i*B;
Out = cellfun(f,S,'UniformOutput' , false);
toc
N =
1000000
Elapsed time is 2.609569 seconds.
Elapsed time is 9.871200 seconds.
You might think of concatenating your 2x2 matrices and then performing just 2 multiplications (transposing correctly on the way), but you will lose time in the concatenation.