I want to multiply elements of a matrix T against elements of two vectors vec_1 and vec_2, and sum everything up. Using nested for loops, I can do it like this:
T = eye(3);
vec_1 = [4,5,6];
vec_2 = [7,8,9];
tot = 0;
for m=1:3
for n=1:3
tot = tot + T(m,n) .* vec_1(m) .* vec_2(n);
end
end
I wanted to make it faster using vectorization so I tried the following.
T = eye(3);
vec_1 = [4,5,6];
vec_2 = [7,8,9];
f = @(m,n) T(m,n) .* vec_1(m) .* vec_2(n);
[M, N] = meshgrid(1:3,1:3);
tot = sum(f(M,N),'all');
However, this doesn't work and I get the error 'Matrix dimensions must agree.' From debugging it, the problem is due to T being evaluated using M and N. Instead of returning a 3x3 matrix as I expected, T(M,N) returns a 9x9 matrix. How can I fix this code so I can use vectorization instead of nested for loops for this task?
If T is always the identity matrix, then you are only looking at the trace of the outer product, which is just the inner product:
tot = vec_1 * vec_2'
If T could be arbitrary and not always the identity matrix, you could just code the outer product directly:
tot = sum(T.*(vec_1'*vec_2),'all')
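As a rough check of the above, the sketch below compares the nested loop with the outer-product form, with the equivalent bilinear form vec_1*T*vec_2', and with a repaired version of the anonymous-function attempt. The use of magic(3) as a non-identity T, the sub2ind/ndgrid fix and the tot_* variable names are illustrative assumptions, not part of the original answer.

```matlab
T = magic(3);                 % assumed example of an arbitrary (non-identity) T
vec_1 = [4,5,6];
vec_2 = [7,8,9];

% Reference: nested loops from the question
tot_loop = 0;
for m = 1:3
    for n = 1:3
        tot_loop = tot_loop + T(m,n) * vec_1(m) * vec_2(n);
    end
end

% Outer product, then element-wise weighting by T
tot_outer = sum(T .* (vec_1' * vec_2), 'all');

% Same sum written as a bilinear form (row vector * matrix * column vector)
tot_quad = vec_1 * T * vec_2';

% Repaired anonymous function: T(M,N) selects a full sub-block, so convert
% the subscript pairs to linear indices to pick element (m,n) only
[M, N] = ndgrid(1:3, 1:3);
f = @(m,n) T(sub2ind(size(T), m, n)) .* vec_1(m) .* vec_2(n);
tot_idx = sum(f(M,N), 'all');
```

All four totals should agree; the bilinear form avoids building the 3x3 intermediate at all.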
I am trying to evaluate the matrices Y(p,k) and Z(p,k) using the following simplified Matlab code.
They depend on some matrices A(j,k), B(j,p) and C(j,k) which I am able to precalculate, so I have just initialised them as random arrays for this MWE. (Note that B is a different size to A and C).
Nj = 5000;
Nk = 1000;
Np = 500; % max loop iterations
A = rand(Nj,Nk); % dummy precalculated matrices
B = rand(Nj,Np);
C = rand(Nj,Nk);
Y = zeros(Np,Nk); % allocate storage
Z = zeros(Np,Nk);
tic
for p = 1:Np
X = A .* B(:,p);
Y(p,:) = sum( X , 1 );
Z(p,:) = sum( C .* X , 1 );
end
toc % Evaluates to 11 seconds on my system
As can be seen above, I am repeating my calculation by looping over index p (because the matrix B depends on p).
I have managed to get this far by moving everything which can be precalculated outside the loop (contained in A, B and C), but on my system this code still takes around 11 seconds to execute. Can anyone see a way in Matlab to speed this up, or perhaps even remove the loop and process all at once?
Thank you
I think the following should be equivalent and much faster:
Y = B' * A;
Z = B' * (A.*C);
Notes:
If B is complex-valued then you should use .' for transposition instead.
You may also want to pre-compute B directly in transposed form (i.e. as a Np by Nj matrix) to avoid the transposition altogether.
If C is not needed anywhere else, then pre-compute it as A.*C instead in order to avoid the extra element-wise multiplication.
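A small sanity check of the equivalence, assuming real-valued inputs and much smaller sizes than in the question so it runs instantly. The names Y_loop, Z_loop, err_Y and err_Z are introduced here for illustration only.

```matlab
Nj = 50; Nk = 10; Np = 5;        % reduced sizes, for a quick check only
A = rand(Nj,Nk);
B = rand(Nj,Np);
C = rand(Nj,Nk);

% Reference: the loop from the question
Y_loop = zeros(Np,Nk);
Z_loop = zeros(Np,Nk);
for p = 1:Np
    X = A .* B(:,p);
    Y_loop(p,:) = sum(X, 1);
    Z_loop(p,:) = sum(C .* X, 1);
end

% Matrix-product form: Y(p,k) = sum_j B(j,p)*A(j,k), likewise for Z
Y = B.' * A;
Z = B.' * (A .* C);

% Largest absolute deviation from the loop result (round-off only)
err_Y = max(abs(Y(:) - Y_loop(:)));
err_Z = max(abs(Z(:) - Z_loop(:)));
```

The two versions sum in a different order, so they should match to round-off rather than bit for bit.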
I wish to fill an M x N x W matrix ‘S’ with the data from matrices ‘P’ and ‘Q’. They are defined below and illustrated in the attached image. We also know that n_1 + n_2 = N and m < M, so all the data fits in the ‘S’ matrix.
S = zeros(M,N,W);
P = rand(m,n_1,W);
Q = rand(m,n_2,W);
I wish to combine ‘P’ and ‘Q’ in a manner specified by 3 other matrices, ‘Line_num’, ‘P_col’ and ‘Q_col’, described below and in the middle part of the attached image.
P_col = randperm(N); P_col = P_col(1:n_1); % 1 x n_1 matrix
Q_col = setxor(P_col, 1:1:N); % 1 x n_2 matrix
Line_num is a matrix composed of W vectors of the form aa:1:bb, where bb-aa = m and aa is chosen at random for each vector.
The important thing is that the 3rd dimension of all these matrices indexes W different test cases, whose data are different and must not be mixed with each other.
To fill ‘S’ one may proceed in two logical steps (although if it can be done in one I shall be glad):
1. Combine Q and P into an intermediate matrix Y of shape m x N x W by interweaving their columns. The columns specified in ‘Q_col’ are taken from Q (using the vector index) and put in the matrix Y (using the vector value). The same goes for P.
2. For each of the W vectors composing Line_num and arrays composing S, use the values in the vector Line_num to spread out Y to the corresponding rows in S, meanwhile maintaining their top to bottom order.
I wish to achieve this without for-loops as I am looking to ‘vectorize’ my code and thus improve its running speed.
I have had a look at this post and this post, which are similar to what I desire. However they are simpler as the numbers to be extracted are constant. Maybe something similar would be appropriate?
Thank you for your help :)
Link to the image aforementioned
EDIT: here is an example code with a for-loop of what I want (my problem is that I want to get rid of the loop)
W = 4;
N = 10; n_1 = 4; n_2 = 6;
M = 20; m = 5;
P_col = [1,3,5,8]; % 1 x n_1 matrix
Q_col = setxor(P_col, 1:1:N); % 1 x n_2 matrix
line_num(:,:,1) = [1,5,10,15,18];
line_num(:,:,2) = [2,3,8,11,12];
line_num(:,:,3) = [4,7,8,14,19];
line_num(:,:,4) = [2,6,13,15,16];
S = zeros(M,N,W);
P = rand(m,n_1,W);
Q = rand(m,n_2,W);
for w=1:W
line_num_I = line_num(:,:,w);
S(line_num_I,Q_col,w) = Q(:,:,w);
S(line_num_I,P_col,w) = P(:,:,w);
end
Here is a vectorized solution. I'm not sure whether it is more efficient than the loop version, especially when the data is large.
S ( reshape(line_num,[],1,W) ...
+ ([Q_col-1 P_col-1]) * M ...
+ (reshape(0:W-1,1,1,[]))*M*N ...
) ...
= ...
[reshape(Q,[],W);reshape(P,[],W)];
Here implicit expansion is used to convert subscripts to indices. Equivalently bsxfun can be used to compute linear indices:
S ( ...
bsxfun(@plus, ...
reshape(line_num,[],1,W), ...
bsxfun (@plus, ...
([Q_col-1 P_col-1]) * M, ...
(reshape(0:W-1,1,1,[]))*M*N ...
) ...
) ...
) ...
= ...
[reshape(Q,[],W);reshape(P,[],W)];
*Here you can find how to convert a 3D subscript to a linear index.
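The same subscript-to-index conversion can also be spelled out with sub2ind, which may be easier to read than the hand-computed offsets. This is only a sketch on the example data from the question. The names rows, cols, pages, S_loop and S_vec are introduced here for illustration, and whether it runs faster than the loop is not something this sketch measures.

```matlab
W = 4; N = 10; n_1 = 4; M = 20; m = 5;
P_col = [1,3,5,8];
Q_col = setxor(P_col, 1:N);
n_2 = numel(Q_col);
line_num = cat(3, [1,5,10,15,18], [2,3,8,11,12], ...
                  [4,7,8,14,19], [2,6,13,15,16]);   % 1 x m x W
P = rand(m,n_1,W);
Q = rand(m,n_2,W);

% Reference: the loop from the question
S_loop = zeros(M,N,W);
for w = 1:W
    S_loop(line_num(:,:,w), Q_col, w) = Q(:,:,w);
    S_loop(line_num(:,:,w), P_col, w) = P(:,:,w);
end

% Build one row, column and page subscript per element of [Q P]
rows  = repmat(reshape(line_num, m, 1, W), 1, N, 1);   % m x N x W
cols  = repmat([Q_col, P_col], m, 1, W);               % m x N x W
pages = repmat(reshape(1:W, 1, 1, W), m, N, 1);        % m x N x W

% Convert subscripts to linear indices and assign everything at once
S_vec = zeros(M,N,W);
S_vec(sub2ind([M,N,W], rows, cols, pages)) = cat(2, Q, P);
```

The column order of cat(2, Q, P) must match the order of [Q_col, P_col]; swapping one without the other would silently scramble the columns.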
So I ended up finding the answer. For those who may be interested, the above for-loop may be replaced by
% 1. Combine columns
mixed_col = zeros(m,N,W);
mixed_col(:,Q_col,:) = Q(:,:,:);
mixed_col(:,P_col,:) = P(:,:,:);
mixed_col = permute(mixed_col,[2,1,3]); % turn 3D matrix into 2D [1]
mixed_col = reshape(mixed_col,N,[],1)';
% 2. Combine lines
S = reshape(S,M*W,N,1); % turn 3D matrix into 2D [2]
line_num_v = permute(line_num + reshape((0:1:(W-1)).*M,1,1,W),[2,1,3]); % turn 3D matrix into 2D [3]
line_num_v = reshape(line_num_v,[],1,1);
S(line_num_v,:) = mixed_col(:,:); % combine using three 2D matrices
S = permute(reshape(S',N,M,W),[2,1,3]);
This involves lots of reshaping but I don't have a simpler answer.
Thanks again for your help.
In MATLAB, I have a matrix SimC which has dimension 22 x 4. I regenerate this matrix 10 times using a for loop.
I want to end up with a matrix U that contains the first SimC in rows 1 to 22, the second SimC in rows 23 to 44 and so on. Hence U should have dimension 220 x 4 in the end.
Thank you!!
Edit:
nTrials = 10;
n = 22;
U = zeros(nTrials * n , 4) %Dimension of the final output matrix
for i = 1 : nTrials
SimC = SomeSimulation() %This generates an nx4 matrix
U = vertcat(SimC)
end
Unfortunately the above doesn't work as U = vertcat(SimC) only gives back SimC instead of concatenating.
vertcat is a good choice, but it will result in a growing matrix. This is not good practice on larger programs because it can really slow down. In your problem, though, you aren't looping through too many times, so vertcat is fine.
To use vertcat, you would NOT pre-allocate the full final size of the U matrix...just create an empty U. Then, when invoking vertcat, you need to give it both matrices that you want to concatenate:
nTrials = 10;
n = 22;
U = []; %create an empty output matrix
for i = 1 : nTrials
SimC = SomeSimulation(); %This generates an nx4 matrix
U = vertcat(U,SimC); %concatenate the two matrices
end
The better way to do this, since you already know the final size, is to pre-allocate your full U (as you did) and then put your values into U via computing the correct indices. Something like this:
nTrials = 10;
n = 22;
U = zeros(nTrials * n , 4); %create a full output matrix
for i = 1 : nTrials
SimC = SomeSimulation(); %This generates an nx4 matrix
indices = (i-1)*n+[1:n]; %here are the rows where you want to put the latest output
U(indices,:)=SimC; %copies SimC into the correct rows of U
end
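A third option, sketched here under the assumption that each trial returns an n x 4 matrix, is to collect the results in a cell array and concatenate once after the loop. The anonymous function simulate is a hypothetical stand-in for SomeSimulation(), and results is a name introduced for this example.

```matlab
nTrials = 10;
n = 22;
simulate = @() rand(n, 4);      % hypothetical stand-in for SomeSimulation()

results = cell(nTrials, 1);     % one cell per trial, pre-allocated
for i = 1:nTrials
    results{i} = simulate();    % store this trial's n x 4 matrix
end

% Expand the cell array into a comma-separated list and stack vertically
U = vertcat(results{:});        % (nTrials*n) x 4
```

This avoids both the growing matrix and the index arithmetic, at the cost of holding the results twice in memory for a moment.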
I need to numerically evaluate some integrals which are all of the form shown in this image:
These integrals are the matrix elements of a N x N matrix, so I need to evaluate them for all possible combinations of n and m in the range of 1 to N. The integrals are symmetric in n and m which I have implemented in my current nested for loop approach:
function [V] = coulomb3(N, l, R, R0, c, x)
r1 = 0.01:x:R;
r2 = R:x:R0;
r = [r1 r2];
rl1 = r1.^(2*l);
rl2 = r2.^(2*l);
sines = zeros(N, length(r));
V = zeros(N, N);
for i = 1:N;
sines(i, :) = sin(i*pi*r/R0);
end
x1 = length(r1);
x2 = length(r);
for nn = 1:N
for mm = 1:nn
f1 = (1/6)*rl1.*r1.^2.*sines(nn, 1:x1).*sines(mm, 1:x1);
f2 = ((R^2/2)*rl2 - (R^3/3)*rl2.*r2.^(-1)).*sines(nn, x1+1:x2).*sines(mm, x1+1:x2);
value = 4*pi*c*x*trapz([f1 f2]);
V(nn, mm) = value;
V(mm, nn) = value;
end
end
I figured that calling sin(x) in the loop was a bad idea, so I calculate all the needed values and store them. To evaluate the integrals I used trapz, but as the first and the second/third integrals have different ranges the function values need to be calculated separately and then combined.
I've tried a couple of different ways of vectorization but the only one that gives the correct results takes much longer than the above loop (used gmultiply but the arrays created are enormous). I've also made an analytical solution (which is possible assuming m and n are integers and R0 > R > 0) but these solutions involve a cosine integral (cosint in MATLAB) function which is extremely slow for large N.
I'm not sure the entire thing can be vectorized without creating very large arrays, but the inner loop at least should be possible. Any ideas would be greatly appreciated!
The inputs I use currently are:
R0 = 1000;
R = 8.4691;
c = 0.393*10^(-2);
x = 0.01;
l = 0; % Can reasonably be 0-6
N = 20; % Increasing the value will give the same results,
% but I would like to be able to do at least N = 600
Using these values
V(1, 1:3) = 873.379900963549 -5.80688363271849 -3.38139152472590
Although the diagonal values never converge with increasing R0 so they are less interesting.
With my approach you will lose the gain from the symmetry of the problem, which costs a factor of 2. Odds are that you'll still benefit in the end.
The idea is to use multidimensional arrays, making use of trapz supporting these inputs. I'll demonstrate the first term in your figure, as the two others should be done similarly, and the point is the technique:
r1 = 0.01:x:R;
r2 = R:x:R0;
r = [r1 r2].';
rl1 = r1.'.^(2*l);
rl2 = r2.'.^(2*l);
sines = zeros(length(r),N); %// CHANGED!!
%// V = zeros(N, N); not needed now, see later
%// you can define sines in a vectorized way as well:
sines = sin(r*(1:N)*pi/R0); %//' now size [Nr, N] !
%// note that implicitly r is of size [Nr, 1, 1]
%// and sines is of size [Nr, N, 1]
sines2mat = permute(sines,[1, 3, 2]); %// size [Nr, 1, N]
%// the first term in V: perform integral along first dimension
%//V1 = 1/6*squeeze(trapz(bsxfun(@times,bsxfun(@times,r.^(2*l+2),sines),sines2mat),1))*x; %// 4*pi*c prefactor might be physics, not math
V1 = 1/6*permute(trapz(bsxfun(@times,bsxfun(@times,r.^(2*l+2),sines),sines2mat),1),[2,3,1])*x; %// 4*pi*c prefactor might be physics, not math
The key point is that bsxfun(@times,r.^(2*l+2),sines) is a matrix of size [Nr,N,1], which is again multiplied by sines2mat using bsxfun. The result is of size [Nr,N,N], and an element (k1,k2,k3) corresponds to an integrand at radial point k1, n=k2 and m=k3. Using trapz() with explicitly the first dimension (which would be default) reduces this to an array of size [1,N,N], which is just what you need after a good squeeze(). Update: as per @Dev-iL's comment you should use permute instead of squeeze to get rid of the leading singleton dimension, as that might be more efficient.
The two other terms can be handled the same way, and of course it might still help if you restructure the integrals based on overlapping and non-overlapping parts.
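To illustrate the technique in isolation, the sketch below integrates r^(2l+2)*sin(n*pi*r/R0)*sin(m*pi*r/R0) over a single made-up range and compares the 3D-array result with a plain double loop. The small N and R0, the single integration range and the names sines3, V_vec and V_loop are assumptions for this example, and implicit expansion (R2016b or later) is used in place of bsxfun. It is not the full three-term integral from the question.

```matlab
N = 4; R0 = 10; x = 0.01; l = 0;       % small illustrative values
r = (0.01:x:R0).';                     % radial grid as a column, [Nr, 1]

sines  = sin(r * (1:N) * pi / R0);     % [Nr, N], column n is sin(n*pi*r/R0)
sines3 = permute(sines, [1, 3, 2]);    % [Nr, 1, N], m runs along dimension 3

% Integrand for every (n, m) pair at once, size [Nr, N, N]
integrand = r.^(2*l+2) .* sines .* sines3;

% Integrate along dimension 1, then drop the leading singleton dimension
V_vec = permute(trapz(integrand, 1), [2, 3, 1]) * x;   % [N, N]

% Reference: one trapz call per (n, m) pair
V_loop = zeros(N);
for nn = 1:N
    for mm = 1:N
        V_loop(nn, mm) = x * trapz(r.^(2*l+2) .* sines(:, nn) .* sines(:, mm));
    end
end
```

The intermediate array holds Nr*N*N doubles, so for N = 600 and a fine grid it may be worth processing m in blocks rather than all at once.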
The result of solving a polynomial equation is a 1x2 vector, or a 1x1 in some instances. I am trying to store all solutions for equations with different coefficients, so some solutions are just 1x1 vectors. How can I store these efficiently?
n = 1;
%sol = zeros(size(coef)); %create solution matrix in memory
sol = {};
while n < size(coef,2)
sol(n) = roots(coef(:,n));
end
This gives the error "Conversion to cell from double is not possible."
coef is the coefficient matrix.
You're almost there!
In order to store the vectors as cells in the cell array, use curly braces {} during their assignment:
sol(n) = {roots(coef(:,n))};
or alternatively:
sol{n} = roots(coef(:,n));
That way, the vectors/arrays can be of any size. Check this link for more info about accessing data in cell arrays.
Also, don't forget to increment n, otherwise you will get an infinite loop.
Whole code:
n = 1;
%sol = zeros(size(coef)); %create solution matrix in memory
sol = {};
while n <= size(coef,2)
sol(n) = {roots(coef(:,n))};
n = n+1;
end
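If the number of polynomials is known up front, a for loop over a pre-allocated cell array sidesteps the manual counter entirely. This is a sketch with a made-up coef matrix in which each column holds one polynomial's coefficients, highest power first; nPoly is a name introduced for this example.

```matlab
% Assumed example: columns are x^2 - 3x + 2, x^2 - 4 and x - 5
coef = [ 1  1  0;
        -3  0  1;
         2 -4 -5];

nPoly = size(coef, 2);
sol = cell(1, nPoly);               % pre-allocate one cell per polynomial
for n = 1:nPoly
    sol{n} = roots(coef(:, n));     % each cell can hold a different number of roots
end
```

Here the third column has a leading zero coefficient, so its cell ends up with a single root while the other two hold two roots each.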