How would you remove the loop from this matlab code - matlab

Given that we have:
x is 2d matrix with size [numSamples x numFeatures]
A is 2d square matrix with size [numFeatures x numFeatures]
B is a 1d vector with size [1 x numFeatures]
I would like to evaluate the following code without a loop: (or in a way faster way)
out = zeros(1,numSamples);
for i = 1:numSamples
res = sum(repmat(B - x(i,:), numSamples, 1)*A.*(x - repmat(x(i,:), numSamples, 1)), 2).^2;
out(i) = var(res);
end
If you have other suggestions on a faster improvement of the above, it is also more than welcome.

You can bsxfun those piece-by-piece for a vectorized solution -
P1 = bsxfun(#minus,B,x)*A;
P2 = bsxfun(#minus,x,permute(x,[3 2 1]));
out = var(squeeze(sum(bsxfun(#times,P1,P2),2)).^2.');
Partially vectorized approach -
P = (bsxfun(#minus,B,x)*A).'; %//'
out = zeros(1,numSamples);
for i = 1:numSamples
out(i) = var((bsxfun(#minus,x,x(i,:))*P(:,i)).^2);
end

Related

Tensor multiplication w/o looping in Matlab

I have a 3d array A, e.g. A=rand(N,N,K).
I need an array B s.t.
B(n,m) = norm(A(:,:,n)*A(:,:,m)' - A(:,:,m)*A(:,:,n)','fro')^2 for all indices n,m in 1:K.
Here's the looping code:
B = zeros(K,K);
for n=1:K
for m=1:K
B(n,m) = norm(A(:,:,n)*A(:,:,m)' - A(:,:,m)*A(:,:,n)','fro')^2;
end
end
I don't want to loop through 1:K.
I can create an array An_x_mt of size NK x NK s.t.
An_x_mt equals A(:,:,n)*A(:,:,m)' for all n,m in 1:K by
An_x_mt = Ar*Ac_t;
with
Ac_t=reshape(permute(A,[2 1 3]),size(A,1),[]);
Ar=Ac_t';
How do I create an array Am_x_nt also of size NK x NK s.t.
Am_x_nt equals A(:,:,m)*A(:,:,n)' for all n,m in 1:K
so that I could do
B = An_x_mt - Am_x_nt
B = reshape(B,N,N,[]);
B = reshape(squeeze(sum(sum(B.^2,1),2)),K,K);
Thx
For those who can't/won't use mmx and want to stick to pure Matlab code, here's how you could do it. mat2cell and cell2mat functions are your friends:
[N,~,nmat]=size(A);
Atc = reshape(permute(A,[2 1 3]),N,[]); % A', N x N*nmat
Ar = Atc'; % A, N*nmat x N
Anmt_2d = Ar*Atc; % An*Am'
Anmt_2d_cell = mat2cell(Anmt_2d,N*ones(nmat,1),N*ones(nmat,1));
Amnt_2d_cell = Anmt_2d_cell'; % ONLY products transposed, NOT their factors
Amnt_2d = cell2mat(Amnt_2d_cell); % Am*An'
Anm = Anmt_2d - Amnt_2d;
Anm = Anm.^2;
Anm_cell = mat2cell(Anm,N*ones(nmat,1),N*ones(nmat,1));
d = cellfun(#(c) sum(c(:)), Anm_cell); % squared Frobenius norm of each product; nmat x nmat
Alternatively, after computing Anmt_2d_cell and Amnt_2d_cell, you could convert them to 3d with the 3rd dimension encoding the (n,m) and (m,n) indices and then do the rest of the computations in 3d. You would need the permn() utility from here https://www.mathworks.com/matlabcentral/fileexchange/7147-permn-v-n-k
Anmt_3d = cat(3,Anmt_2d_cell);
Amnt_3d = cat(3,Amnt_2d_cell);
Anm_3d = Anmt_3d - Amnt_3d;
Anm_3d = Anm_3d.^2;
Anm = squeeze(sum(sum(Anm_3d,1),2));
d = zeros(nmat,nmat);
nm=permn(1:nmat, 2); % all permutations (n,m) with repeat, by-row order
d(sub2ind([nmat,nmat],nm(:,1),nm(:,2))) = Anm;
For some reason, the 2nd option (3D arrays) is twice faster.
Hopes this helps.

Matlab : Vectorize technics in 3 dimensions matrix

I actually vectorizing one of my code and I have some issues.
This is my initial code:
CoordVorBd = random(N+1,3)
CoordCP = random(N,3)
v = random(1,3)
for i = 1 : N
for j = 1 : N
ri1j = (-CoordVorBd (i,:) + CoordCP(j,:));
vij(i,j,:) = cross(v,ri1j))/(norm(ri1j)
end
end
I have start to vectorize that creating some matrix that contains 3*1 Vectors. My size of matrix is N*N*3.
CoordVorBd1(1:N,:) = CoordVorBd(2:N+1,:);
CoordCP_x= CoordCP(:,1);
CoordCP_y= CoordCP(:,2);
CoordCP_z= CoordCP(:,3);
CoordVorBd_x = CoordVorBd([1:N],1);
CoordVorBd_y = CoordVorBd([1:N],2);
CoordVorBd_z = CoordVorBd([1:N],3);
CoordVorBd1_x = CoordVorBd1(:,1);
CoordVorBd1_y = CoordVorBd1(:,2);
CoordVorBd1_z = CoordVorBd1(:,3);
[X,Y] = meshgrid (1:N);
ri1j_x = (-CoordVorBd_x(X) + CoordCP_x(Y));
ri1j_y = (-CoordVorBd_y(X) + CoordCP_y(Y));
ri1j_z = (-CoordVorBd_z(X) + CoordCP_z(Y));
ri1jmat(:,:,1) = ri1j_x(:,:);
ri1jmat(:,:,2) = ri1j_y(:,:);
ri1jmat(:,:,3) = ri1j_z(:,:);
vmat(:,:,1) = ones(N)*v(1);
vmat(:,:,2) = ones(N)*v(2);
vmat(:,:,3) = ones(N)*v(3);
This code works but is heavy in terms of variable creation. I did'nt achieve to apply the vectorization to all the matrix in one time.
The formule like
ri1jmat(X,Y,1:3) = (-CoordVorBd (X,:) + CoordCP(Y,:));
doesn't work...
If someone have some ideas to have something cleaner.
At this point I have a N*N*3 matrix ri1jmat with all my vectors.
I want to compute the N*N rij1norm matrix that is the norm of the vectors
rij1norm(i,j) = norm(ri1jmat(i,j,1:3))
to be able to vectorize the vij matrix.
vij(:,:,1:3) = (cross(vmat(:,:,1:3),ri1jmat(:,:,1:3))/(ri1jmatnorm(:,:));
The cross product works.
I tried numbers of method without achieve to have this rij1norm matrix without doing a double loop.
If someone have some tricks, thanks by advance.
Here's a vectorized version. Note your original loop didn't include the last column of CoordVorBd, so if that was intentional you need to remove it from the below code as well. I assumed it was a mistake.
CoordVorBd = rand(N+1,3);
CoordCP = rand(N,3);
v = rand(1,3);
repCoordVor=kron(CoordVorBd', ones(1,size(CoordCP,1)))'; %based on http://stackoverflow.com/questions/16266804/matlab-repeat-every-column-sequentially-n-times
repCoordCP=repmat(CoordCP, size(CoordVorBd,1),1); %repeat matrix
V2=-repCoordVor + repCoordCP; %your ri1j
nrm123=sqrt(sum(V2.^2,2)); %vectorized norm for each row
vij_unformatted=cat(3,(v(:,2).*V2(:,3) - V2(:,2).*v(:,3))./nrm123,(v(:,3).*V2(:,1) - V2(:,3).*v(:,1))./nrm123,(v(:,1).*V2(:,2) - V2(:,1).*v(:,2))./nrm123); % cross product, expanded, and each term divided by norm, could use bsxfun(#rdivide,cr123,nrm123) instead, if cr123 is same without divisions
vij=permute(reshape( vij_unformatted,N,N+1,3),[2,1,3]); %reformat to match your vij
Here is another way to do it using arrayfun
% Define a meshgrid of indices to run over
[I, J] = meshgrid(1:N, 1:(N+1));
% Calculate ril for each index
rilj = arrayfun(#(x, y) -CoordVorBd (y,:) + CoordCP(x,:), I, J, 'UniformOutput', false);
%Calculate vij for each point
temp_vij1 = arrayfun(#(x, y) cross(v, rilj{x, y}) / norm(rilj{x, y}), J, I, 'UniformOutput', false);
%Reshape the matrix into desired format
temp_vij2 = cell2mat(temp_vij1);
vij = cat(3, temp_vij2(:, 1:3:end), temp_vij2(:, 2:3:end), temp_vij2(:, 3:3:end));

Computing Mahalanobis Distance Between Set of Points and Set of Reference Points

I have an n x p matrix - mX which is composed of n points in R^p.
I have another m x p matrix - mY which is composed of m reference points in R^p.
I would like to create an n x m matrix - mD which is the Mahalanobis Distance matrix.
D(i, j) means the Mahalanobis Distance between point i in mX, mX(i, :) and point j in mY, mY(j, :).
Namely, is computes the following:
mD(i, j) = (mX(i, :) - mY(j, :)) * inv(mC) * (mX(i, :) - mY(j, :)).';
Where mC is the given Mahalanobis Distance PSD Matrix.
It is easy to be done in a loop, is there a way to vectorize it?
Namely, is the a function which its inputs are mX, mY and mC and its output is mD and fully vectorized without using any MATLAB toolbox?
Thank You.
Approach #1
Assuming infinite resources, here's one vectorized solution using bsxfun and matrix-multiplication -
A = reshape(bsxfun(#minus,permute(mX,[1 3 2]),permute(mY,[3 1 2])),[],p);
out = reshape(diag(A*inv(mC)*A.'),n,m);
Approach #2
Here's a comprise solution trying to reduce the loop complexity -
A = reshape(bsxfun(#minus,permute(mX,[1 3 2]),permute(mY,[3 1 2])),[],p);
imC = inv(mC);
out = zeros(n*m,1);
for ii = 1:n*m
out(ii) = A(ii,:)*imC*A(ii,:).';
end
out = reshape(out,n,m);
Sample run -
>> n = 3; m = 4; p = 5;
mX = rand(n,p);
mY = rand(m,p);
mC = rand(p,p);
imC = inv(mC);
>> %// Original solution
for i = 1:n
for j = 1:m
mD(i, j) = (mX(i, :) - mY(j, :)) * inv(mC) * (mX(i, :) - mY(j, :)).'; %//'
end
end
>> mD
mD =
-8.4256 10.032 2.8929 7.1762
-44.748 -4.3851 -13.645 -9.6702
-4.5297 3.2928 0.11132 2.5998
>> %// Approach #1
A = reshape(bsxfun(#minus,permute(mX,[1 3 2]),permute(mY,[3 1 2])),[],p);
out = reshape(diag(A*inv(mC)*A.'),n,m); %//'
>> out
out =
-8.4256 10.032 2.8929 7.1762
-44.748 -4.3851 -13.645 -9.6702
-4.5297 3.2928 0.11132 2.5998
>> %// Approach #2
A = reshape(bsxfun(#minus,permute(mX,[1 3 2]),permute(mY,[3 1 2])),[],p);
imC = inv(mC);
out1 = zeros(n*m,1);
for ii = 1:n*m
out1(ii) = A(ii,:)*imC*A(ii,:).'; %//'
end
out1 = reshape(out1,n,m);
>> out1
out1 =
-8.4256 10.032 2.8929 7.1762
-44.748 -4.3851 -13.645 -9.6702
-4.5297 3.2928 0.11132 2.5998
Instead if you had :
mD(j, i) = (mX(i, :) - mY(j, :)) * inv(mC) * (mX(i, :) - mY(j, :)).';
The solutions would translate to the versions listed next.
Approach #1
A = reshape(bsxfun(#minus,permute(mY,[1 3 2]),permute(mX,[3 1 2])),[],p);
out = reshape(diag(A*inv(mC)*A.'),m,n);
Approach #2
A = reshape(bsxfun(#minus,permute(mY,[1 3 2]),permute(mX,[3 1 2])),[],p);
imC = inv(mC);
out1 = zeros(m*n,1);
for i = 1:n*m
out(i) = A(i,:)*imC*A(i,:).'; %//'
end
out = reshape(out,m,n);
Sample run -
>> n = 3; m = 4; p = 5;
mX = rand(n,p); mY = rand(m,p); mC = rand(p,p); imC = inv(mC);
>> %// Original solution
for i = 1:n
for j = 1:m
mD(j, i) = (mX(i, :) - mY(j, :)) * inv(mC) * (mX(i, :) - mY(j, :)).'; %//'
end
end
>> mD
mD =
0.81755 0.33205 0.82254
1.7086 1.3363 2.4209
0.36495 0.78394 -0.33097
0.17359 0.3889 -1.0624
>> %// Approach #1
A = reshape(bsxfun(#minus,permute(mY,[1 3 2]),permute(mX,[3 1 2])),[],p);
out = reshape(diag(A*inv(mC)*A.'),m,n); %//'
>> out
out =
0.81755 0.33205 0.82254
1.7086 1.3363 2.4209
0.36495 0.78394 -0.33097
0.17359 0.3889 -1.0624
>> %// Approach #2
A = reshape(bsxfun(#minus,permute(mY,[1 3 2]),permute(mX,[3 1 2])),[],p);
imC = inv(mC);
out1 = zeros(m*n,1);
for i = 1:n*m
out1(i) = A(i,:)*imC*A(i,:).'; %//'
end
out1 = reshape(out1,m,n);
>> out1
out1 =
0.81755 0.33205 0.82254
1.7086 1.3363 2.4209
0.36495 0.78394 -0.33097
0.17359 0.3889 -1.0624
Here is one solution that eliminates one loop
function d = mahalanobis(mX, mY)
n = size(mX, 2);
m = size(mY, 2);
data = [mX, mY];
mc = cov(transpose(data));
dist = zeros(n,m);
for i = 1 : n
diff = repmat(mX(:,i), 1, m) - mY;
dist(i,:) = sum((mc\diff).*diff , 1);
end
d = sqrt(dist);
end
You would invoke it as:
d = mahalanobis(transpose(X),transpose(Y))
Reduce to L2
It seems that Mahalanobis Distance can be reduced to ordinary L2 distance if you are allowed to preprocess matrix mC and you are not afraid of numerical differences.
First of all, compute Cholesky decomposition of mC:
mR = chol(mC) % C = R^t * R, where R is upper-triangular
Now we can use these factors to reformulate Mahalanobis Distance:
(Xi-Yj) * inv(C) * (Xi-Yj)^t = || (Xi-Yj) inv(R) ||^2 = ||TXi - TYj||^2
where: TXi = Xi * inv(R)
TYj = Yj * inv(R)
So the idea is to transform points Xi, Yj to TXi, TYj first, and then compute euclidean distances between them. Here is the algorithm outline:
Compute mR - Cholesky factor of covariance matrix mC (takes O(p^3) time).
Invert triangular matrix mR (takes O(p^3) time).
Multiply both mX and mY by inv(mR) on the right (takes O(p^2 (m+n)) time).
Compute squared L2 distances between pairs of points (takes O(m n p) time).
Total time is O(m n p + (m + n) p^2 + p^3) versus original O(m n p^2). It should work faster when 1 << p << n,m. In such case step 4 would takes most of the time and should be vectorized.
Vectorization
I have little experience of MATLAB, but quite a lot of SIMD vectorization on x86 CPUs. In raw computations, it would be enough to vectorize along one sufficiently large array dimension, and make trivial loops for the other dimensions.
If you expect p to be large enough, it may probably be OK to vectorize along coordinates of points, and make two nested loops for i <= n and j <= m. That's similar to what #Daniel posted.
If p is not sufficiently large, you can vectorize along one of the point sequences instead. This would be similar to solution posted by #dpmcmlxxvi: you have to subtract single row of one matrix from all the rows of the second matrix, then compute squared norms of the resulting rows. Repeat n times (
or m times).
As for me, full vectorization (which means rewriting with matrix operations instead of loops in MATLAB) does not sound like a clever performance goal. Most likely partially vectorized solutions would be optimally fast.
I came to the conclusion that vectorizing this problem is not efficient. My best idea for vectorizing this problem would require m x n x p x p working memory, at least if everything is processed at once. This means with n=m=p=152 the code would already require 4GB Ram. At these dimensions, my system can run the loop in less than a second:
mD=zeros(size(mX,1),size(mY,1));
ImC=inv(mC);
for i=1:size(mX,1)
for j=1:size(mY,1)
d=mX(i, :) - mY(j, :);
mD(i, j) = (d) * ImC * (d).';
end
end

Vectorization - Sum and Bessel function

Can anyone help vectorize this Matlab code? The specific problem is the sum and bessel function with vector inputs.
Thank you!
N = 3;
rho_g = linspace(1e-3,1,N);
phi_g = linspace(0,2*pi,N);
n = 1:3;
tau = [1 2.*ones(1,length(n)-1)];
for ii = 1:length(rho_g)
for jj = 1:length(phi_g)
% Coordinates
rho_o = rho_g(ii);
phi_o = phi_g(jj);
% factors
fc = cos(n.*(phi_o-phi_s));
fs = sin(n.*(phi_o-phi_s));
Ez_t(ii,jj) = sum(tau.*besselj(n,k(3)*rho_s).*besselh(n,2,k(3)*rho_o).*fc);
end
end
You could try to vectorize this code, which might be possible with some bsxfun or so, but it would be hard to understand code, and it is the question if it would run any faster, since your code already uses vector math in the inner loop (even though your vectors only have length 3). The resulting code would become very difficult to read, so you or your colleague will have no idea what it does when you have a look at it in 2 years time.
Before wasting time on vectorization, it is much more important that you learn about loop invariant code motion, which is easy to apply to your code. Some observations:
you do not use fs, so remove that.
the term tau.*besselj(n,k(3)*rho_s) does not depend on any of your loop variables ii and jj, so it is constant. Calculate it once before your loop.
you should probably pre-allocate the matrix Ez_t.
the only terms that change during the loop are fc, which depends on jj, and besselh(n,2,k(3)*rho_o), which depends on ii. I guess that the latter costs much more time to calculate, so it better to not calculate this N*N times in the inner loop, but only N times in the outer loop. If the calculation based on jj would take more time, you could swap the for-loops over ii and jj, but that does not seem to be the case here.
The result code would look something like this (untested):
N = 3;
rho_g = linspace(1e-3,1,N);
phi_g = linspace(0,2*pi,N);
n = 1:3;
tau = [1 2.*ones(1,length(n)-1)];
% constant part, does not depend on ii and jj, so calculate only once!
temp1 = tau.*besselj(n,k(3)*rho_s);
Ez_t = nan(length(rho_g), length(phi_g)); % preallocate space
for ii = 1:length(rho_g)
% calculate stuff that depends on ii only
rho_o = rho_g(ii);
temp2 = besselh(n,2,k(3)*rho_o);
for jj = 1:length(phi_g)
phi_o = phi_g(jj);
fc = cos(n.*(phi_o-phi_s));
Ez_t(ii,jj) = sum(temp1.*temp2.*fc);
end
end
Initialization -
N = 3;
rho_g = linspace(1e-3,1,N);
phi_g = linspace(0,2*pi,N);
n = 1:3;
tau = [1 2.*ones(1,length(n)-1)];
Nested loops form (Copy from your code and shown here for comparison only) -
for ii = 1:length(rho_g)
for jj = 1:length(phi_g)
% Coordinates
rho_o = rho_g(ii);
phi_o = phi_g(jj);
% factors
fc = cos(n.*(phi_o-phi_s));
fs = sin(n.*(phi_o-phi_s));
Ez_t(ii,jj) = sum(tau.*besselj(n,k(3)*rho_s).*besselh(n,2,k(3)*rho_o).*fc);
end
end
Vectorized solution -
%%// Term - 1
term1 = repmat(tau.*besselj(n,k(3)*rho_s),[N*N 1]);
%%// Term - 2
[n1,rho_g1] = meshgrid(n,rho_g);
term2_intm = besselh(n1,2,k(3)*rho_g1);
term2 = transpose(reshape(repmat(transpose(term2_intm),[N 1]),N,N*N));
%%// Term -3
angle1 = repmat(bsxfun(#times,bsxfun(#minus,phi_g,phi_s')',n),[N 1]);
fc = cos(angle1);
%%// Output
Ez_t = sum(term1.*term2.*fc,2);
Ez_t = transpose(reshape(Ez_t,N,N));
Points to note about this vectorization or code simplification –
‘fs’ doesn’t change the output of the script, Ez_t, so it could be removed for now.
The output seems to be ‘Ez_t’,which requires three basic terms in the code as –
tau.*besselj(n,k(3)*rho_s), besselh(n,2,k(3)*rho_o) and fc. These are calculated separately for vectorization as terms1,2 and 3 respectively.
All these three terms appear to be of 1xN sizes. Our aim thus becomes to calculate these three terms without loops. Now, the two loops run for N times each, thus giving us a total loop count of NxN. Thus, we must have NxN times the data in each such term as compared to when these terms were inside the nested loops.
This is basically the essence of the vectorization done here, as the three terms are represented by ‘term1’,’term2’ and ‘fc’ itself.
In order to give a self-contained answer, I'll copy the original initialization
N = 3;
rho_g = linspace(1e-3,1,N);
phi_g = linspace(0,2*pi,N);
n = 1:3;
tau = [1 2.*ones(1,length(n)-1)];
and generate some missing data (k(3) and rho_s and phi_s in the dimension of n)
rho_s = rand(size(n));
phi_s = rand(size(n));
k(3) = rand(1);
then you can compute the same Ez_t with multidimensional arrays:
[RHO_G, PHI_G, N] = meshgrid(rho_g, phi_g, n);
[~, ~, TAU] = meshgrid(rho_g, phi_g, tau);
[~, ~, RHO_S] = meshgrid(rho_g, phi_g, rho_s);
[~, ~, PHI_S] = meshgrid(rho_g, phi_g, phi_s);
FC = cos(N.*(PHI_G - PHI_S));
FS = sin(N.*(PHI_G - PHI_S)); % not used
EZ_T = sum(TAU.*besselj(N, k(3)*RHO_S).*besselh(N, 2, k(3)*RHO_G).*FC, 3).';
You can check afterwards that both matrices are the same
norm(Ez_t - EZ_T)

Multiply a 3D matrix with a 2D matrix

Suppose I have an AxBxC matrix X and a BxD matrix Y.
Is there a non-loop method by which I can multiply each of the C AxB matrices with Y?
As a personal preference, I like my code to be as succinct and readable as possible.
Here's what I would have done, though it doesn't meet your 'no-loops' requirement:
for m = 1:C
Z(:,:,m) = X(:,:,m)*Y;
end
This results in an A x D x C matrix Z.
And of course, you can always pre-allocate Z to speed things up by using Z = zeros(A,D,C);.
You can do this in one line using the functions NUM2CELL to break the matrix X into a cell array and CELLFUN to operate across the cells:
Z = cellfun(#(x) x*Y,num2cell(X,[1 2]),'UniformOutput',false);
The result Z is a 1-by-C cell array where each cell contains an A-by-D matrix. If you want Z to be an A-by-D-by-C matrix, you can use the CAT function:
Z = cat(3,Z{:});
NOTE: My old solution used MAT2CELL instead of NUM2CELL, which wasn't as succinct:
[A,B,C] = size(X);
Z = cellfun(#(x) x*Y,mat2cell(X,A,B,ones(1,C)),'UniformOutput',false);
Here's a one-line solution (two if you want to split into 3rd dimension):
A = 2;
B = 3;
C = 4;
D = 5;
X = rand(A,B,C);
Y = rand(B,D);
%# calculate result in one big matrix
Z = reshape(reshape(permute(X, [2 1 3]), [A B*C]), [B A*C])' * Y;
%'# split into third dimension
Z = permute(reshape(Z',[D A C]),[2 1 3]);
Hence now: Z(:,:,i) contains the result of X(:,:,i) * Y
Explanation:
The above may look confusing, but the idea is simple.
First I start by take the third dimension of X and do a vertical concatenation along the first dim:
XX = cat(1, X(:,:,1), X(:,:,2), ..., X(:,:,C))
... the difficulty was that C is a variable, hence you can't generalize that expression using cat or vertcat. Next we multiply this by Y:
ZZ = XX * Y;
Finally I split it back into the third dimension:
Z(:,:,1) = ZZ(1:2, :);
Z(:,:,2) = ZZ(3:4, :);
Z(:,:,3) = ZZ(5:6, :);
Z(:,:,4) = ZZ(7:8, :);
So you can see it only requires one matrix multiplication, but you have to reshape the matrix before and after.
I'm approaching the exact same issue, with an eye for the most efficient method. There are roughly three approaches that i see around, short of using outside libraries (i.e., mtimesx):
Loop through slices of the 3D matrix
repmat-and-permute wizardry
cellfun multiplication
I recently compared all three methods to see which was quickest. My intuition was that (2) would be the winner. Here's the code:
% generate data
A = 20;
B = 30;
C = 40;
D = 50;
X = rand(A,B,C);
Y = rand(B,D);
% ------ Approach 1: Loop (via #Zaid)
tic
Z1 = zeros(A,D,C);
for m = 1:C
Z1(:,:,m) = X(:,:,m)*Y;
end
toc
% ------ Approach 2: Reshape+Permute (via #Amro)
tic
Z2 = reshape(reshape(permute(X, [2 1 3]), [A B*C]), [B A*C])' * Y;
Z2 = permute(reshape(Z2',[D A C]),[2 1 3]);
toc
% ------ Approach 3: cellfun (via #gnovice)
tic
Z3 = cellfun(#(x) x*Y,num2cell(X,[1 2]),'UniformOutput',false);
Z3 = cat(3,Z3{:});
toc
All three approaches produced the same output (phew!), but, surprisingly, the loop was the fastest:
Elapsed time is 0.000418 seconds.
Elapsed time is 0.000887 seconds.
Elapsed time is 0.001841 seconds.
Note that the times can vary quite a lot from one trial to another, and sometimes (2) comes out the slowest. These differences become more dramatic with larger data. But with much bigger data, (3) beats (2). The loop method is still best.
% pretty big data...
A = 200;
B = 300;
C = 400;
D = 500;
Elapsed time is 0.373831 seconds.
Elapsed time is 0.638041 seconds.
Elapsed time is 0.724581 seconds.
% even bigger....
A = 200;
B = 200;
C = 400;
D = 5000;
Elapsed time is 4.314076 seconds.
Elapsed time is 11.553289 seconds.
Elapsed time is 5.233725 seconds.
But the loop method can be slower than (2), if the looped dimension is much larger than the others.
A = 2;
B = 3;
C = 400000;
D = 5;
Elapsed time is 0.780933 seconds.
Elapsed time is 0.073189 seconds.
Elapsed time is 2.590697 seconds.
So (2) wins by a big factor, in this (maybe extreme) case. There may not be an approach that is optimal in all cases, but the loop is still pretty good, and best in many cases. It is also best in terms of readability. Loop away!
Nope. There are several ways, but it always comes out in a loop, direct or indirect.
Just to please my curiosity, why would you want that anyway ?
To answer the question, and for readability, please see:
ndmult, by ajuanpi (Juan Pablo Carbajal), 2013, GNU GPL
Input
2 arrays
dim
Example
nT = 100;
t = 2*pi*linspace (0,1,nT)’;
# 2 experiments measuring 3 signals at nT timestamps
signals = zeros(nT,3,2);
signals(:,:,1) = [sin(2*t) cos(2*t) sin(4*t).^2];
signals(:,:,2) = [sin(2*t+pi/4) cos(2*t+pi/4) sin(4*t+pi/6).^2];
sT(:,:,1) = signals(:,:,1)’;
sT(:,:,2) = signals(:,:,2)’;
G = ndmult (signals,sT,[1 2]);
Source
Original source. I added inline comments.
function M = ndmult (A,B,dim)
dA = dim(1);
dB = dim(2);
# reshape A into 2d
sA = size (A);
nA = length (sA);
perA = [1:(dA-1) (dA+1):(nA-1) nA dA](1:nA);
Ap = permute (A, perA);
Ap = reshape (Ap, prod (sA(perA(1:end-1))), sA(perA(end)));
# reshape B into 2d
sB = size (B);
nB = length (sB);
perB = [dB 1:(dB-1) (dB+1):(nB-1) nB](1:nB);
Bp = permute (B, perB);
Bp = reshape (Bp, sB(perB(1)), prod (sB(perB(2:end))));
# multiply
M = Ap * Bp;
# reshape back to original format
s = [sA(perA(1:end-1)) sB(perB(2:end))];
M = squeeze (reshape (M, s));
endfunction
I highly recommend you use the MMX toolbox of matlab. It can multiply n-dimensional matrices as fast as possible.
The advantages of MMX are:
It is easy to use.
Multiply n-dimensional matrices (actually it can multiply arrays of 2-D matrices)
It performs other matrix operations (transpose, Quadratic Multiply, Chol decomposition and more)
It uses C compiler and multi-thread computation for speed up.
For this problem, you just need to write this command:
C=mmx('mul',X,Y);
here is a benchmark for all possible methods. For more detail refer to this question.
1.6571 # FOR-loop
4.3110 # ARRAYFUN
3.3731 # NUM2CELL/FOR-loop/CELL2MAT
2.9820 # NUM2CELL/CELLFUN/CELL2MAT
0.0244 # Loop Unrolling
0.0221 # MMX toolbox <===================
I would like to share my answer to the problems of:
1) making the tensor product of two tensors (of any valence);
2) making the contraction of two tensors along any dimension.
Here are my subroutines for the first and second tasks:
1) tensor product:
function [C] = tensor(A,B)
C = squeeze( reshape( repmat(A(:), 1, numel(B)).*B(:).' , [size(A),size(B)] ) );
end
2) contraction:
Here A and B are the tensors to be contracted along the dimesions i and j respectively. The lengths of these dimensions should be equal, of course. There's no check for this (this would obscure the code) but apart from this it works well.
function [C] = tensorcontraction(A,B, i,j)
sa = size(A);
La = length(sa);
ia = 1:La;
ia(i) = [];
ia = [ia i];
sb = size(B);
Lb = length(sb);
ib = 1:Lb;
ib(j) = [];
ib = [j ib];
% making the i-th dimension the last in A
A1 = permute(A, ia);
% making the j-th dimension the first in B
B1 = permute(B, ib);
% making both A and B 2D-matrices to make use of the
% matrix multiplication along the second dimension of A
% and the first dimension of B
A2 = reshape(A1, [],sa(i));
B2 = reshape(B1, sb(j),[]);
% here's the implicit implication that sa(i) == sb(j),
% otherwise - crash
C2 = A2*B2;
% back to the original shape with the exception
% of dimensions along which we've just contracted
sa(i) = [];
sb(j) = [];
C = squeeze( reshape( C2, [sa,sb] ) );
end
Any critics?
I would think recursion, but that's the only other non- loop method you can do
You could "unroll" the loop, ie write out all the multiplications sequentially that would occur in the loop