Calculate a function over all permutations of columns - matlab

I have this code:
abs(mean(exp(1i*( a(:,1) - a(:,2) ))))
where a is a 550-by-129 double matrix. How can I extend this so that a(:,1) is replaced in turn by a(:,2), then a(:,3), and so on? I need the result for every column subtracted from every other column.

Another method using matrix multiplication:
E = exp(1i*a);
result = abs(E.'*(1./E)/size(E,1));
Explanation:
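You can rewrite the element-wise expression
exp(1i*(a - b))
as
exp(1i*a)./exp(1i*b)
that is,
exp(1i*a).*(1./exp(1i*b))
and mean(x) is sum(x)/n.
The sum of the element-wise products of one column of E with one column of 1./E is a dot product, so E.'*(1./E) yields those sums for all column pairs at once. Using that, you can do your task with very fast matrix multiplication.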
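As a quick sanity check (a minimal sketch, not part of the original answer), the matrix-multiplication form can be compared against the plain double loop on a small random matrix. The 6-by-4 size and the names fast, slow and maxErr are arbitrary choices for illustration:

```matlab
a = rand(6, 4);                          % small stand-in for the 550-by-129 matrix

% Matrix-multiplication form
E = exp(1i*a);
fast = abs(E.' * (1./E) / size(E,1));    % fast(i,j) pairs column i with column j

% Reference: explicit double loop over all column pairs
n = size(a, 2);
slow = zeros(n);
for ii = 1:n
    for jj = 1:n
        slow(ii,jj) = abs(mean(exp(1i*(a(:,ii) - a(:,jj)))));
    end
end

maxErr = max(abs(fast(:) - slow(:)));    % expected to be at rounding-error level
```

Note that .' (plain transpose) is used rather than ' (conjugate transpose), since conjugating E would flip the sign of the phase.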
Result of a comparison between different methods in Octave:
Matrix Multiplication:
Elapsed time is 0.0133181 seconds.
BSXFUN:
Elapsed time is 1.33882 seconds.
REPMAT:
Elapsed time is 1.43535 seconds.
FOR LOOP:
Elapsed time is 3.10798 seconds.
Here is the code for comparing different methods.

Looped, this is an easy trick; let an outer loop run over all indices, and an inner loop as well.
a = rand(550,129);
out = zeros(size(a,2),size(a,2));
for ii = 1:size(a,2)
for jj = 1:size(a,2)
out(ii,jj) = abs(mean(exp(1i*(a(:,ii)-a(:,jj)))));
end
end

No loops, one line:
result = permute(abs(mean(exp(1i*bsxfun(@minus, a, permute(a, [1 3 2]))),1)), [2 3 1]);
This computes all pairwise column differences as a 3D array, where the second and third dimensions refer to the two column indices in the original 2D array; then applies the required operations along the first dimension; and finally permutes the dimensions to yield a 2D array result.
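To make the intermediate shapes concrete, here is the same one-liner split into steps on a small input. This is only an illustrative sketch; the 5-by-3 size and the names d, m, szD, szM and szR are assumptions made for the example:

```matlab
a = rand(5, 3);                               % 5 observations, 3 columns

d = bsxfun(@minus, a, permute(a, [1 3 2]));   % 5x3x3: d(:,i,j) = a(:,i) - a(:,j)
m = abs(mean(exp(1i*d), 1));                  % 1x3x3: reduce along the first dimension
result = permute(m, [2 3 1]);                 % 3x3: result(i,j) pairs column i with column j

szD = size(d);
szM = size(m);
szR = size(result);
```

The 3D intermediate holds size(a,1)*size(a,2)^2 complex values, which is why this approach needs noticeably more memory than the loop.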

A bit off-topic, but you can do that with indexing too (note that imresize requires the Image Processing Toolbox):
a = rand(550,129);
c = repmat(1:size(a,2),1,size(a,2));
c(2,:) = imresize(1:size(a,2), [1 length(c)], 'nearest');
out = abs(mean(exp(1i*( a(:,c(1,:)) - a(:,c(2,:)) ))));
out = reshape(out,[size(a,2) size(a,2)]); % 129x129 format

Related

using reshape for a mean in a 3D matrix

I have a 3D (x,y,nframes) matrix/movie (named ch) and a logical mask (x,y). I want to compute the mean of the mask pixels in each frame, so that at the end I get a vector of size 1-by-nframes. I want to do it with reshape instead of frame by frame, but the two results don't match and I don't understand why. Could you please let me know why?
for i=1:nframes
ch_aux=ch(:,:,i);
Mean_for(i)= mean(ch_aux(mask));
end
% C with reshape
[row,col] = find(mask);
A=ch(row,col,:);
B=reshape(A,1,length(row).^2,nframes );
Mean_res=mean(B);
plot( Mean_for,'r')
hold on
plot( Mean_res(:))
legend({'for','reshape'})
thanks!
Solution:
Using reshape
r = reshape( ch, [], size(ch,3) );
Mean_res = mean( r(mask(:),:), 1 ); % average over the masked pixels, giving 1-by-nframes
Benchmarking (comparing this solution to the two proposed by Divakar) can be found here showing:
Shai
Elapsed time is 0.0234721 seconds.
Divakar 1
Elapsed time is 0.743586 seconds.
Divakar 2
Elapsed time is 0.025841 seconds.
The bsxfun-based solution (Divakar 1) is significantly slower.
What caused the error in the original code?
I suspect your problem lies in the expression A=ch(row,col,:);
Suppose ch is of size 2-by-2-by-n and mask = [ 1 0; 0 1];. In that case
[row, col] = find(mask);
results in
row = [1; 2];
col = [1; 2];
Indexing with A=ch(row,col,:); selects every combination of the listed rows and columns, not just the matching (row, col) pairs, so A equals ch exactly, which is not what you want...
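The effect is easy to reproduce on a tiny array. The following sketch uses made-up values (1 to 8) and the illustrative names sameAsCh and onlyMasked; it contrasts subscript indexing with logical indexing on the reshaped array:

```matlab
ch = reshape(1:8, 2, 2, 2);          % two 2x2 frames: [1 3; 2 4] and [5 7; 6 8]
mask = logical([1 0; 0 1]);          % only the diagonal pixels are wanted

[row, col] = find(mask);
A = ch(row, col, :);                 % takes ALL row/column combinations
sameAsCh = isequal(A, ch);           % true: the off-diagonal pixels came along too

r = reshape(ch, [], size(ch, 3));    % one column per frame
onlyMasked = r(mask(:), :);          % masked pixels only: [1 5; 4 8]
```

Each row of onlyMasked is one masked pixel and each column is one frame, so averaging down the columns gives the per-frame mean.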
For efficiency, you could use another vectorized solution with bsxfun along with your favourite reshape -
Mean_bsxfun = sum(reshape(bsxfun(@times,ch,mask),[],size(ch,3)),1)./sum(mask(:))
Or even better, abuse the fast matrix multiplication in MATLAB -
Mean_matmult = mask(:).'*reshape(ch,[],size(ch,3))./sum(mask(:))
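A small consistency check, offered as a sketch rather than as part of the answers: on random data the frame-by-frame loop, the reshape-plus-logical-indexing form, the bsxfun form and the matrix-multiplication form should all agree. The array sizes and the explicit double(...) cast on the mask are choices made for this example:

```matlab
ch = rand(4, 5, 3);                       % illustrative 4x5 movie with 3 frames
mask = rand(4, 5) > 0.5;
mask(1,1) = true;                         % make sure at least one pixel is selected
nframes = size(ch, 3);

% Reference: frame-by-frame loop from the question
Mean_for = zeros(1, nframes);
for i = 1:nframes
    ch_aux = ch(:,:,i);
    Mean_for(i) = mean(ch_aux(mask));
end

r = reshape(ch, [], nframes);             % one column per frame
Mean_reshape = mean(r(mask(:), :), 1);    % mean over masked pixels, per frame
Mean_bsxfun  = sum(reshape(bsxfun(@times, ch, mask), [], nframes), 1) ./ sum(mask(:));
Mean_matmult = double(mask(:)).' * r ./ sum(mask(:));
```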

How to profile a vector outer product in matlab

During my MATLAB profiling, I noticed one line of code that consumes much more time than I expected. Any idea how to make it faster?
X = Y(ids_A, ids_A) - (Y(ids_A,k) * Y(k,ids_A))/Y(k,k);
X and Y are symmetric matrices with the same size (d-by-d), k is the index of a single row/column in Y, and ids_A is a vector of indices of all the other rows/columns (therefore Y(ids_A,k) is a column vector and Y(k,ids_A) is a row vector).
ids_A = setxor(1:d,k);
Thanks!
You can perhaps replace the outer product multiplication with a call to bsxfun:
X = Y(ids_A, ids_A) - (bsxfun(@times, Y(ids_A,k), Y(k,ids_A))/Y(k,k));
So how does the above code work? Let's take a look at the definition of the outer product when one vector has 4 elements and the other has 3 elements:
$$
\mathbf{u} \otimes \mathbf{v} = \mathbf{u}\mathbf{v}^{\mathsf{T}} =
\begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{bmatrix}
\begin{bmatrix} v_1 & v_2 & v_3 \end{bmatrix} =
\begin{bmatrix}
u_1 v_1 & u_1 v_2 & u_1 v_3 \\
u_2 v_1 & u_2 v_2 & u_2 v_3 \\
u_3 v_1 & u_3 v_2 & u_3 v_3 \\
u_4 v_1 & u_4 v_2 & u_4 v_3
\end{bmatrix}
$$
Source: Wikipedia
As you can see, the outer product is created by element-wise products where the first vector u is replicated horizontally while the second vector v is replicated vertically. You then find the element-wise products of each element to produce your result. This is eloquently done with bsxfun:
bsxfun(@times, u, v.');
u would be a column vector and v.' would be a row vector. bsxfun naturally replicates the data to follow the above pattern, and then we use @times to perform the element-wise products.
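For a concrete instance of that pattern, here is a minimal sketch with made-up vectors; the names outerMtimes and outerBsxfun are only for illustration:

```matlab
u = [1; 2; 3; 4];                        % 4-element column vector
v = [10; 20; 30];                        % 3-element column vector

outerMtimes = u * v.';                   % outer product via matrix multiplication
outerBsxfun = bsxfun(@times, u, v.');    % same 4x3 result via implicit replication
```

Both give the 4-by-3 matrix [10 20 30; 20 40 60; 30 60 90; 40 80 120]. Whether the bsxfun form is actually faster than the plain product depends on the MATLAB version and sizes, so it is worth timing on your own data.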
I am assuming your code to look something like this -
for k = 1:d
ids_A = setxor(1:d,k);
X = Y(ids_A, ids_A) - (Y(ids_A,k) * Y(k,ids_A))/Y(k,k);
end
With the given code snippet, it's safe to assume that you are somehow using X within that loop. You can calculate all the X matrices as a pre-calculation step before the start of such a loop and these calculations could be performed as a vectorized approach.
Regarding the code snippet itself, it can be seen that you are "escaping" one index at each iteration with setxor. Now, if you are going with a vectorized approach, you can perform all those mathematical operations in one-go and later on remove the elements that got incorporated in the vectorized approach, but weren't intended. This really is the essence of a bsxfun based vectorized approach listed next -
%// Perform all matrix-multiplications in one go with bsxfun and permute
mults = bsxfun(@times,permute(Y,[1 3 2]),permute(Y,[3 2 1]));
%// Scale those with diagonal elements from Y and get X for every iteration
scaledvals = bsxfun(@rdivide,mults,permute(Y(1:d+1:end),[1 3 2]));
X_vectorized = bsxfun(@minus,Y,scaledvals);
%// Find row and column indices as linear indices to be removed from X_all
row_idx = bsxfun(@plus,[0:d-1]*d+1,[0:d-1]'*(d*d+1));
col_idx = bsxfun(@plus,[1:d]',[0:d-1]*(d*(d+1)));
%// Remove those "setxored" indices and then reshape to expected size
X_vectorized([row_idx col_idx])=[];
X_vectorized = reshape(X_vectorized,d-1,d-1,d);
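The following sketch checks, for a small d, that page k of X_vectorized matches the X produced by iteration k of the assumed loop. The test input (random with a boosted diagonal so that Y(k,k) is never close to zero), the use of (:) on the index arrays, and the name maxErr are assumptions made for this example:

```matlab
d = 5;                                   % small illustrative size
Y = rand(d) + d*eye(d);                  % test input with a safely nonzero diagonal

% Vectorized computation, following the steps listed above
mults = bsxfun(@times, permute(Y,[1 3 2]), permute(Y,[3 2 1]));       % Y(i,k)*Y(k,j)
scaledvals = bsxfun(@rdivide, mults, permute(Y(1:d+1:end),[1 3 2]));  % divide by Y(k,k)
X_vectorized = bsxfun(@minus, Y, scaledvals);                         % d x d x d
row_idx = bsxfun(@plus, (0:d-1)*d+1, (0:d-1)'*(d*d+1));   % row k of page k
col_idx = bsxfun(@plus, (1:d)', (0:d-1)*(d*(d+1)));       % column k of page k
X_vectorized([row_idx(:); col_idx(:)]) = [];
X_vectorized = reshape(X_vectorized, d-1, d-1, d);

% Reference: the loop, one k at a time
maxErr = 0;
for k = 1:d
    ids_A = setxor(1:d, k);
    X = Y(ids_A, ids_A) - (Y(ids_A,k) * Y(k,ids_A)) / Y(k,k);
    maxErr = max(maxErr, max(max(abs(X - X_vectorized(:,:,k)))));
end
```

The intermediate arrays are d-by-d-by-d, so memory grows cubically with d.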
Benchmarking
Benchmarking Code
d = 50; %// Datasize
Y = rand(d,d); %// Create random input
num_iter = 100; %// Number of iterations to be run for each approach
%// Warm up tic/toc.
for k = 1:100000
tic(); elapsed = toc();
end
disp('------------------------------ With original loopy approach')
tic
for iter = 1:num_iter
for k = 1:d
ids_A = setxor(1:d,k);
X = Y(ids_A, ids_A) - (Y(ids_A,k) * Y(k,ids_A))/Y(k,k);
end
end
toc
clear X k ids_A
disp('------------------------------ With proposed vectorized approach')
tic
for iter = 1:num_iter
mults = bsxfun(@times,permute(Y,[1 3 2]),permute(Y,[3 2 1]));
scaledvals = bsxfun(@rdivide,mults,permute(Y(1:d+1:end),[1 3 2]));
X_vectorized = bsxfun(@minus,Y,scaledvals);
row_idx = bsxfun(@plus,[0:d-1]*d+1,[0:d-1]'*(d*d+1));
col_idx = bsxfun(@plus,[1:d]',[0:d-1]*(d*(d+1)));
X_vectorized([row_idx col_idx])=[];
X_vectorized = reshape(X_vectorized,d-1,d-1,d);
end
toc
Results
Case #1: d = 50
------------------------------ With original loopy approach
Elapsed time is 0.849518 seconds.
------------------------------ With proposed vectorized approach
Elapsed time is 0.154395 seconds.
Case #2: d = 100
------------------------------ With original loopy approach
Elapsed time is 2.079886 seconds.
------------------------------ With proposed vectorized approach
Elapsed time is 2.285884 seconds.
Case #3: d = 200
------------------------------ With original loopy approach
Elapsed time is 7.592865 seconds.
------------------------------ With proposed vectorized approach
Elapsed time is 19.012421 seconds.
Conclusions
One can easily notice that the proposed vectorized approach is the better choice for small matrices (about 5.5x faster at 50 x 50). At 100 x 100 the two approaches are roughly on par, and beyond that the memory-hungry bsxfun slows us down.

MATLAB: Block matrix multiplying without loops

I have a block matrix [A B C...] and a matrix D (all 2-dimensional). D has dimensions y-by-y, and A, B, C, etc are each z-by-y. Basically, what I want to compute is the matrix [D*(A'); D*(B'); D*(C');...], where X' refers to the transpose of X. However, I want to accomplish this without loops for speed considerations.
I have been playing with the reshape command for several hours now, and I know how to use it in other cases, but this use case is different from the other ones and I cannot figure it out. I also would like to avoid using multi-dimensional matrices if at all possible.
Honestly, a loop is probably the best way to do it. In my image-processing work I found a well-written loop that takes advantage of Matlab's JIT compiler is often faster than all the extra overhead of manipulating the data to be able to use a vectorised operation. A loop like this:
[m, n] = size(A);
T = zeros(m, n);
AT = A';
% Note: the block indexing below assumes square blocks (z == y == m)
for ii=1:m:n
T(:, ii:ii+m-1) = D * AT(ii:ii+m-1, :);
end
contains only built-in operators and the bare minimum of copying, and given the JIT is going to be hard to beat. Even if you want to factor in interpreter overhead it's still only a single statement with no functions to consider.
The "loop-free" version with extra faffing around and memory copying, is to split the matrix and iterate over the blocks with a hidden loop:
blksize = size(D, 1);
blkcnt = size(A, 2) / blksize;
blocks = mat2cell(A, blksize, repmat(blksize,1,blkcnt));
blocks = cellfun(@(x) D*x', blocks, 'UniformOutput', false);
T = cell2mat(blocks);
Of course, if you have access to the Image Processing Toolbox, you can also cheat horribly:
T = blockproc(A, size(D), @(x) D*x.data');
Prospective approach & Solution Code
Given:
M is the block matrix [A B C...], where each A, B, C etc. are of size z x y. Let the number of such matrices be num_mat for easy reference later on.
If those matrices are concatenated along the columns, then M would be of size z x num_mat*y.
D is the matrix to be multiplied with each of those matrices A, B, C etc. and is of size y x y.
Now, as stated in the problem, the output you are after is [D*(A'); D*(B'); D*(C');...], i.e. the multiplication results being concatenated along the rows.
If you are okay with those multiplication results being concatenated along the columns instead, i.e. [D*(A') D*(B') D*(C') ...], you can achieve the same with some reshaping and then performing the matrix multiplication for the entire M with D, and thus have a vectorized no-loop approach. To get such a matrix multiplication result, you can do -
mults = D*reshape(permute(reshape(M,z,y,[]),[2 1 3]),y,[]);
But, if you HAVE to get an output with the multiplication results being concatenated along the rows, you need to do some more reshaping like so -
out = reshape(permute(reshape(mults,y,z,[]),[1 3 2]),[],z);
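To see that the two reshape/permute steps really produce [D*(A'); D*(B'); ...], the sketch below compares them against a block-by-block loop on deliberately non-square blocks. The sizes and the names expected, blk and maxErr are illustrative assumptions:

```matlab
z = 2; y = 3; num_mat = 4;               % small, non-square blocks (z ~= y)
M = rand(z, num_mat*y);                  % block matrix [A B C ...]
D = rand(y, y);

% Proposed approach
mults = D*reshape(permute(reshape(M,z,y,[]),[2 1 3]),y,[]);   % [D*A' D*B' ...]
out = reshape(permute(reshape(mults,y,z,[]),[1 3 2]),[],z);   % [D*A'; D*B'; ...]

% Reference: one block at a time, stacked along the rows
expected = zeros(y*num_mat, z);
for n = 1:num_mat
    blk = M(:, (n-1)*y+1 : n*y);         % n-th z-by-y block
    expected((n-1)*y+1 : n*y, :) = D * blk.';
end

maxErr = max(abs(out(:) - expected(:)));
```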
Benchmarking
This section covers benchmarking codes comparing the proposed vectorized approach against a naive JIT powered loopy approach to get the desired output. As discussed earlier, depending on how the output array must hold the multiplication results, you can have two cases.
Case I: Multiplication results concatenated along the columns
%// Define size parameters and then define random inputs with those
z = 500; y = 500; num_mat = 500;
M = rand(z,num_mat*y);
D = rand(y,y);
%// Warm up tic/toc.
for k = 1:100000
tic(); elapsed = toc();
end
disp('---------------------------- With loopy approach')
tic
out1 = zeros(z,y*num_mat);
for k1 = 1:y:y*num_mat
out1(:,k1:k1+y-1) = D*M(:,k1:k1+y-1).'; %//'
end
toc, clear out1 k1
disp('---------------------------- With proposed approach')
tic
mults = D*reshape(permute(reshape(M,z,y,[]),[2 1 3]),y,[]);
toc
Case II: Multiplication results concatenated along the rows
%// Define size parameters and then define random inputs with those
z = 500; y = 500; num_mat = 500;
M = rand(z,num_mat*y);
D = rand(y,y);
%// Warm up tic/toc.
for k = 1:100000
tic(); elapsed = toc();
end
disp('---------------------------- With loopy approach')
tic
out1 = zeros(y*num_mat,z);
for k1 = 1:y:y*num_mat
out1(k1:k1+y-1,:) = D*M(:,k1:k1+y-1).'; %//'
end
toc, clear out1 k1
disp('---------------------------- With proposed approach')
tic
mults = D*reshape(permute(reshape(M,z,y,[]),[2 1 3]),y,[]);
out2 = reshape(permute(reshape(mults,y,z,[]),[1 3 2]),[],z);
toc
Runtimes
Case I:
---------------------------- With loopy approach
Elapsed time is 3.889852 seconds.
---------------------------- With proposed approach
Elapsed time is 3.051376 seconds.
Case II:
---------------------------- With loopy approach
Elapsed time is 3.798058 seconds.
---------------------------- With proposed approach
Elapsed time is 3.292559 seconds.
Conclusions
The runtimes suggest a modest speedup with the proposed vectorized approach: about 22% less elapsed time in Case I and about 13% less in Case II. So, hopefully this works out for you!
If you want to get A, B, and C from a bigger matrix you can do this, assuming the bigger matrix is called X:
A = X(:,1:y)
B = X(:,y+1:2*y)
C = X(:,2*y+1:3*y)
If there are N such matrices, the best way is to use reshape, where x is the number of rows of X:
F = reshape(X, x,y,N)
Then use a loop to generate a new matrix I call it F1 as:
F1=[];
for n=1:N
F1 = [F1 F(:,:,n)'];
end
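Then compute F2 as:
F2 = D*F1;
F2 holds the products side by side as [D*A' D*B' ...], so finally rearrange the blocks to stack them along the rows:
R = reshape(permute(reshape(F2, y, x, N), [1 3 2]), N*y, x)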
Note: this for loop does not slow you down as it is just to reformat the matrix and the multiplication is done in matrix form.

MATLAB/Octave - use cellfun to index one matrix with another

I have a cell containing a random number of matrices, say a = {[300*20],....,[300*20]};. I have another cell of the same format, call it b, that contains the logicals of the position of the nan terms in a.
I want to use cellfun to loop through the cell and basically let the nan terms equal to 0 i.e. a(b)=0.
Thanks,
j
You could define a function that replaces any NaN with zero.
function a = nan2zero(a)
a(isnan(a)) = 0;
Then you can use cellfun to apply this function to your cell array.
a0 = cellfun(@nan2zero, a, 'UniformOutput', 0)
That way, you don't even need any matrices b.
First, you should probably give the tick to @s.bandara, as that was the first correct answer and it used cellfun (as you requested). Do NOT give it to this answer. The purpose of this answer is to provide some additional analysis.
I thought I'd look into the efficiency of some of the possible approaches to this problem.
The first approach is the one advocated by @s.bandara.
The second approach is similar to the one advocated by @s.bandara, but it uses b to convert nan to 0, rather than using isnan. In theory, this method may be faster, since nothing is assigned to b inside the function, so it should be treated "By Ref".
The third approach uses a loop to get around using cellfun, since cellfun is often slower than an explicit loop.
The results of a quick speed test are:
Elapsed time is 3.882972 seconds. %# First approach (a, isnan, and cellfun, eg @s.bandara)
Elapsed time is 3.391190 seconds. %# Second approach (a, b, and cellfun)
Elapsed time is 3.041992 seconds. %# Third approach (loop-based solution)
In other words, there are (small) savings to be made by passing b in rather than using isnan. And there are further (small) savings to be made by using a loop rather than cellfun. But I wouldn't lose sleep over it. Remember, the results of any simulation are specific to the specified inputs.
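For a concrete picture of what the variants compute, here is a tiny sketch with made-up data. It builds b from a with isnan (in the question b is already given) and applies the isnan-based and the b-based replacement inside one explicit loop; the names viaIsnan and viaMask are illustrative only:

```matlab
a = {[1 NaN; 3 4], [NaN 2 NaN]};                   % two small example matrices
b = cellfun(@isnan, a, 'UniformOutput', false);    % logical NaN positions, one per cell

viaIsnan = a;
viaMask  = a;
for q = 1:numel(a)
    viaIsnan{q}(isnan(viaIsnan{q})) = 0;   % recompute the NaN positions
    viaMask{q}(b{q}) = 0;                  % reuse the precomputed mask, i.e. a(b) = 0
end
```

Both variants return the same cell array; the only difference is whether the logical index is recomputed or reused.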
Note, these results were consistent across several runs, I used tic and toc to do this, albeit with many loops over each method. If I wanted to be really thorough, I should use timeit from FEX. If anyone is interested, the code for the three methods follows:
%# Build some example matrices
T = 1000; N = 100; Q = 50; M = 100;
a = cell(1, Q); b = cell(1, Q);
for q = 1:Q
a{q} = randn(T, N);
b{q} = logical(randi(2, T, N) - 1);
a{q}(b{q}) = nan;
end
%# Solution using a, isnan, and cellfun (@s.bandara solution)
tic
for m = 1:M
Soln1 = cellfun(@f1, a, 'UniformOutput', 0);
end
toc
%# Solution using a, b, and cellfun
tic
for m = 1:M
Soln2 = cellfun(@f2, a, b, 'UniformOutput', 0);
end
toc
%# Solution using a loop to avoid cellfun
tic
for m = 1:M
Soln3 = cell(1, Q);
for q = 1:Q
Soln3{q} = a{q};
Soln3{q}(b{q}) = 0;
end
end
toc
%# Solution proposed by @EitanT
[K, N] = size(a{1});
tic
for m = 1:M
a0 = [a{:}]; %// Concatenate matrices along the 2nd dimension
a0(isnan(a0)) = 0; %// Replace NaNs with zeroes
Soln4 = mat2cell(a0, K, N * ones(size(a)));
end
toc
where:
function x1 = f1(x1)
x1(isnan(x1)) = 0;
and:
function x1 = f2(x1, x2)
x1(x2) = 0;
UPDATE: A fourth approach has been suggested by @EitanT. This approach concatenates the cell array of matrices into one large matrix, performs the operation on the large matrix, then optionally converts it back to a cell array. I have added the code for this procedure to my testing routine above. For the inputs specified in my testing code, ie T = 1000, N = 100, Q = 50, and M = 100, the timed run is as follows:
Elapsed time is 3.916690 seconds. %# @s.bandara
Elapsed time is 3.362319 seconds. %# a, b, and cellfun
Elapsed time is 2.906029 seconds. %# loop-based solution
Elapsed time is 4.986837 seconds. %# @EitanT
I was somewhat surprised by this as I thought the approach of @EitanT would yield the best results. On paper, it seems extremely sensible. Note, we can of course mess around with the input parameters to find specific settings that advantage different solutions. For example, if the matrices are small, but the number of them is large, then the approach of @EitanT does well, eg T = 10, N = 5, Q = 500, and M = 100 yields:
Elapsed time is 0.362377 seconds. %# @s.bandara
Elapsed time is 0.299595 seconds. %# a, b, and cellfun
Elapsed time is 0.352112 seconds. %# loop-based solution
Elapsed time is 0.030150 seconds. %# @EitanT
Here the approach of @EitanT dominates.
For the scale of the problem indicated by the OP, I found that the loop based solution usually had the best performance. However, for some Q, eg Q = 5, the solution of @EitanT managed to edge ahead.
Hmm.
Given the nature of the contents of your cell array, there may exist an even faster solution: you can convert your cell data to a single matrix and use vector indexing to replace all NaN values in it at once, without the need of cellfun or loops:
a0 = [a{:}]; %// Concatenate matrices along the 2nd dimension
a0(isnan(a0)) = 0; %// Replace NaNs with zeroes
If you want to convert it back to a cell array, that's fine:
[M, N] = size(a{1});
mat2cell(a0, M, N * ones(size(a)))
P.S.
Work with a 3-D matrix instead of a cell array, if possible. Vectorized operations are usually much faster in MATLAB.
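As a minimal sketch of the 3-D suggestion, assuming all matrices in the cell array have the same size (as in the question), the cells can be stacked along the third dimension and cleaned in one indexing operation. The example values and the name A3 are made up for illustration:

```matlab
a = {[1 NaN; 3 4], [NaN 6; 7 NaN]};    % equally sized example matrices
A3 = cat(3, a{:});                     % 2x2x2 array, one page per cell
A3(isnan(A3)) = 0;                     % replace every NaN in a single step
```

Page q of A3 then plays the role of a{q}, e.g. A3(:,:,1) is [1 0; 3 4].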

Use a vector to index a matrix without linear index

G'day,
I'm trying to find a way to use a vector of [x,y] points to index from a large matrix in MATLAB.
Usually, I would convert the subscript points to linear indices of the matrix (e.g. as in the question "Use a vector as an index to a matrix"). However, the matrix is 4-dimensional, and I want to take all of the elements of the 3rd and 4th dimensions that have the same 1st and 2nd dimension. Let me hopefully demonstrate with an example:
Matrix = nan(4,4,2,2); % where the dimensions are (x,y,depth,time)
Matrix(1,2,:,:) = 999; % note that this value could change in depth (3rd dim) and time (4th dim)
Matrix(3,4,:,:) = 888; % note that this value could change in depth (3rd dim) and time (4th dim)
Matrix(4,4,:,:) = 124;
Now, I want to be able to index with the subscripts (1,2) and (3,4), etc. and return not only the 999 and 888 which exist in Matrix(:,:,1,1) but also the contents which exist at Matrix(:,:,1,2), Matrix(:,:,2,1) and Matrix(:,:,2,2), and so on. (IRL, the dimensions of Matrix might be more like size(Matrix) = [300 250 30 200].)
I don't want to use linear indices because I would like the results to be in a similar vector fashion. For example, I would like a result which is something like:
ans(time=1)
999 888 124
999 888 124
ans(time=2)
etc etc etc
etc etc etc
I'd also like to add that due to the size of the matrix I'm dealing with, speed is an issue here - thus why I'd like to use subscript indices to index to the data.
I should also mention that (unlike this question: Accessing values using subscripts without using sub2ind) since I want all the information stored in the extra dimensions, 3 and 4, of the i and jth indices, I don't think that a slightly faster version of sub2ind would cut it.
I can think of three ways to go about this
Simple loop
Just loop over all the 2D indices you have, and use colons to access the remaining dimensions:
for jj = 1:size(twoDinds,1)
M(twoDinds(jj,1),twoDinds(jj,2),:,:) = rand;
end
Vectorized calculation of Linear indices
Skip sub2ind and vectorize the computation of linear indices:
% generalized for arbitrary dimensions of M
sz = size(M);
nd = ndims(M);
arg = arrayfun(@(x)1:x, sz(3:nd), 'UniformOutput', false);
[argout{1:nd-2}] = ndgrid(arg{:});
argout = cellfun(...
@(x) repmat(x(:), size(twoDinds,1),1), ...
argout, 'Uniformoutput', false);
twoDinds = kron(twoDinds, ones(prod(sz(3:nd)),1));
% the linear indices
inds = twoDinds(:,1) + ([twoDinds(:,2) [argout{:}]]-1) * cumprod(sz(1:nd-1)).';
Sub2ind
Just use the ready-made tool that ships with Matlab:
inds = sub2ind(size(M), twoDinds(:,1), twoDinds(:,2), argout{:});
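The hand-rolled index arithmetic and sub2ind should produce identical linear indices. The sketch below checks this on the 4-D example from the question; the names rep, indsManual and indsSub2ind, the three sample points, and the fill value 7 are illustrative assumptions:

```matlab
M = nan(4, 4, 2, 2);
twoDinds = [1 2; 3 4; 4 4];              % (x,y) pairs to address
sz = size(M);
nd = ndims(M);

% All index combinations for the trailing dimensions, repeated once per (x,y) pair
arg = arrayfun(@(x) 1:x, sz(3:nd), 'UniformOutput', false);
argout = cell(1, nd-2);
[argout{:}] = ndgrid(arg{:});
argout = cellfun(@(x) repmat(x(:), size(twoDinds,1), 1), argout, 'UniformOutput', false);
rep = kron(twoDinds, ones(prod(sz(3:nd)), 1));   % each (x,y) pair repeated per combination

% Linear index = i + (j-1)*sz1 + (k-1)*sz1*sz2 + (l-1)*sz1*sz2*sz3
indsManual  = rep(:,1) + ([rep(:,2) [argout{:}]] - 1) * cumprod(sz(1:nd-1)).';
indsSub2ind = sub2ind(sz, rep(:,1), rep(:,2), argout{:});

M(indsManual) = 7;                       % touches every depth/time entry of each pair
```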
Speed
So which one's the fastest? Let's find out:
clc
M = nan(4,4,2,2);
sz = size(M);
nd = ndims(M);
twoDinds = [...
1 2
4 3
3 4
4 4
2 1];
tic
for ii = 1:1e3
for jj = 1:size(twoDinds,1)
M(twoDinds(jj,1),twoDinds(jj,2),:,:) = rand;
end
end
toc
tic
twoDinds_prev = twoDinds;
for ii = 1:1e3
twoDinds = twoDinds_prev;
arg = arrayfun(@(x)1:x, sz(3:nd), 'UniformOutput', false);
[argout{1:nd-2}] = ndgrid(arg{:});
argout = cellfun(...
@(x) repmat(x(:), size(twoDinds,1),1), ...
argout, 'Uniformoutput', false);
twoDinds = kron(twoDinds, ones(prod(sz(3:nd)),1));
inds = twoDinds(:,1) + ([twoDinds(:,2) [argout{:}]]-1) * cumprod(sz(1:nd-1)).';
M(inds) = rand;
end
toc
tic
for ii = 1:1e3
twoDinds = twoDinds_prev;
arg = arrayfun(@(x)1:x, sz(3:nd), 'UniformOutput', false);
[argout{1:nd-2}] = ndgrid(arg{:});
argout = cellfun(...
@(x) repmat(x(:), size(twoDinds,1),1), ...
argout, 'Uniformoutput', false);
twoDinds = kron(twoDinds, ones(prod(sz(3:nd)),1));
inds = sub2ind(size(M), twoDinds(:,1), twoDinds(:,2), argout{:});
M(inds) = rand;
end
toc
Results:
Elapsed time is 0.004778 seconds. % loop
Elapsed time is 0.807236 seconds. % vectorized linear inds
Elapsed time is 0.839970 seconds. % linear inds with sub2ind
Conclusion: use the loop.
Granted, the tests above are largely influenced by JIT's failure to compile the last two loops, and by the non-specificity to 4D arrays (the last two methods also work on ND arrays). Making a specialized version for 4D will undoubtedly be much faster.
Nevertheless, the indexing with simple loop is, well, simplest to do, easiest on the eyes and very fast too, thanks to JIT.
So, here is a possible answer... but it is messy. I suspect it would be more computationally expensive than a more direct method... And this would definitely not be my preferred answer. It would be great if we could get the answer without any for loops!
Matrix = rand(100,200,30,400);
grabthese_x = [1 30 50 90];
grabthese_y = [61 9 180 189];
result = nan(length(grabthese_x), size(Matrix,3), size(Matrix,4));
for tt = 1:size(Matrix,4)
subset = squeeze(Matrix(grabthese_x,grabthese_y,:,tt));
for NN=1:size(Matrix,3)
result(:,NN,tt) = diag(subset(:,:,NN));
end
end
The resulting matrix, result, should have size(result) = [4 size(Matrix,3) size(Matrix,4)].
I think this should work, even if Matrix isn't square. However, it is not ideal, as I said above.
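One possible loop-free alternative, offered only as a sketch and not taken from the answers above: merge the first two dimensions with reshape, so that each (x,y) pair becomes a single row index, and then pick those rows with ordinary sub2ind on the 2-D size. The small array size and the names flat, lin2d and expected are assumptions made for this example:

```matlab
Matrix = rand(5, 6, 3, 4);               % small stand-in for the real data
grabthese_x = [1 3 5 2];
grabthese_y = [6 1 4 4];
sz = size(Matrix);

% Merge x and y into one dimension, then select one row per (x,y) pair
flat = reshape(Matrix, sz(1)*sz(2), sz(3), sz(4));
lin2d = sub2ind(sz(1:2), grabthese_x, grabthese_y);
result = flat(lin2d, :, :);              % numel(grabthese_x) x depth x time

% Reference: pick each pair with colons, as in the simple loop
expected = nan(numel(grabthese_x), sz(3), sz(4));
for p = 1:numel(grabthese_x)
    expected(p, :, :) = reshape(Matrix(grabthese_x(p), grabthese_y(p), :, :), 1, sz(3), sz(4));
end
```

Because reshape does not copy or reorder the data, the only real work is the final indexing step; whether it beats the simple loop for large arrays would need to be timed.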