vectorizing a nested loop where one loop variable depends on the other - matlab

I've recently learned how to vectorize a "simple" nested loop in a previous question I've asked. However, now I'm trying also to vectorize the following loop
A=rand(80,80,10,6,8,8);
I=rand(size(A1,3),1);
C=rand(size(A1,4),1);
B=rand(size(A1,5),1);
for i=1:numel(I)
for v=1:numel(C)
for j=1:numel(B)
for k=1:j
A(:,:,i,v,j,k)= A(:,:,i,v,j,k)*I(i)*C(v)*B(j)*((k-1>0)+1);
end
end
end
end
So now k depends in j... What have I tried so far:
The combination of j and k terms (i.e. B(j)*((k-1>0)+1) gives a triangular matrix that I manage to vectorize independently:
B2=tril([ones(8,1)*B']');
B2(2:end,2:end)=2*B2(2:end,2:end);
But that gives me the (j,k) matrix properly and not a way to use it to vectorize the remaining loop. Maybe I'm in the wrong path too... So how can I vectorize that type of loop?

In one of your comments to the accepted solution of the previous question, you mentioned that successive bsxfun(#times,..,permute..) based codes were faster. If that's the case, you can use a similar approach here as well. Here's the code that uses such a pattern alongwith tril -
B1 = tril(bsxfun(#times,B,[1 ones(1,numel(B)-1).*2]));
v1 = bsxfun(#times,B1, permute(C,[3 2 1]));
v2 = bsxfun(#times,v1, permute(I,[4 3 2 1]));
A = bsxfun(#times,A, permute(v2,[5 6 4 3 1 2]));

You were close. The vectorization you proposed indeed follows the (j,k) logic, but doing tril adds zeros in places where the loop doesn't go into. Using the solution for your previous question (#david's) is not complete as it multiplies all elements including these zero value elements that the loop doesn't go into. My solution to that, is to find these zero elements and replace them with 1 (so simple):
Starting with your code:
B2=tril([ones(8,1)*B']');
B2(2:end,2:end)=2*B2(2:end,2:end);
and following the vectorization shown in the previous question:
s=size(A);
[b,c,d]=ndgrid(I,C,B2);
F=b.*c.*d;
F(F==0)=1; % this is the step that is important for your case.
A=reshape(A,s(1),s(2),[]);
A=bsxfun(#times,A,permute(F(:),[3 2 1]));
A=reshape(A,s);
For the size of A used in the question this cuts about 50% of the run time, not bad...

Related

Matlab: need some help for a seemingly simple vectorization of an operation

I would like to optimize this piece of Matlab code but so far I have failed. I have tried different combinations of repmat and sums and cumsums, but all my attempts seem to not give the correct result. I would appreciate some expert guidance on this tough problem.
S=1000; T=10;
X=rand(T,S),
X=sort(X,1,'ascend');
Result=zeros(S,1);
for c=1:T-1
for cc=c+1:T
d=(X(cc,:)-X(c,:))-(cc-c)/T;
Result=Result+abs(d');
end
end
Basically I create 1000 vectors of 10 random numbers, and for each vector I calculate for each pair of values (say the mth and the nth) the difference between them, minus the difference (n-m). I sum over of possible pairs and I return the result for every vector.
I hope this explanation is clear,
Thanks a lot in advance.
It is at least easy to vectorize your inner loop:
Result=zeros(S,1);
for c=1:T-1
d=(X(c+1:T,:)-X(c,:))-((c+1:T)'-c)./T;
Result=Result+sum(abs(d),1)';
end
Here, I'm using the new automatic singleton expansion. If you have an older version of MATLAB you'll need to use bsxfun for two of the subtraction operations. For example, X(c+1:T,:)-X(c,:) is the same as bsxfun(#minus,X(c+1:T,:),X(c,:)).
What is happening in the bit of code is that instead of looping cc=c+1:T, we take all of those indices at once. So I simply replaced cc for c+1:T. d is then a matrix with multiple rows (9 in the first iteration, and one fewer in each subsequent iteration).
Surprisingly, this is slower than the double loop, and similar in speed to Jodag's answer.
Next, we can try to improve indexing. Note that the code above extracts data row-wise from the matrix. MATLAB stores data column-wise. So it's more efficient to extract a column than a row from a matrix. Let's transpose X:
X=X';
Result=zeros(S,1);
for c=1:T-1
d=(X(:,c+1:T)-X(:,c))-((c+1:T)-c)./T;
Result=Result+sum(abs(d),2);
end
This is more than twice as fast as the code that indexes row-wise.
But of course the same trick can be applied to the code in the question, speeding it up by about 50%:
X=X';
Result=zeros(S,1);
for c=1:T-1
for cc=c+1:T
d=(X(:,cc)-X(:,c))-(cc-c)/T;
Result=Result+abs(d);
end
end
My takeaway message from this exercise is that MATLAB's JIT compiler has improved things a lot. Back in the day any sort of loop would halt code to a grind. Today it's not necessarily the worst approach, especially if all you do is use built-in functions.
The nchoosek(v,k) function generates all combinations of the elements in v taken k at a time. We can use this to generate all possible pairs of indicies then use this to vectorize the loops. It appears that in this case the vectorization doesn't actually improve performance (at least on my machine with 2017a). Maybe someone will come up with a more efficient approach.
idx = nchoosek(1:T,2);
d = bsxfun(#minus,(X(idx(:,2),:) - X(idx(:,1),:)), (idx(:,2)-idx(:,1))/T);
Result = sum(abs(d),1)';
Update: here are the results for the running times for the different proposals (10^5 trials):
So it looks like the transformation of the matrix is the most efficient intervention, and my original double-loop implementation is, amazingly, the best compared to the vectorized versions. However, in my hands (2017a) the improvement is only 16.6% compared to the original using the mean (18.2% using the median).
Maybe there is still room for improvement?

Matlab - Eliminating for loops

So, I am aware that there are a number of other posts about eliminating for loops but I still haven't been able to figure this out.
I am looking to rewrite my code so that it has fewer for loops and runs a little faster. The code describes an optics problem calculating the intensity of different colors after the light has propagated through a medium. I have already gotten credit for this assignment but I would like to learn of better ways than just throwing in for loops all over the place. I tried rewriting the innermost loop using recursion which worked and looked nice but was a little slower.
Any other comments/improvements are also welcome.
Thanks!
n_o=1.50;
n_eo=1.60;
d=20e-6;
N_skiv=100;
lambda=[650e-9 510e-9 475e-9];
E_in=[1;1]./sqrt(2);
alfa=pi/2/N_skiv;
delta=d/N_skiv;
points=100;
int=linspace(0,pi/2,points);
I_ut=zeros(3,points);
n_eo_theta=#(theta)n_eo*n_o/sqrt(n_o^2*cos(theta)^2+n_eo^2*sin(theta)^2);
hold on
for i=1:3
for j=1:points
J_last=J_pol2(0);
theta=int(j);
for n=0:N_skiv
alfa_n=alfa*n;
J_last=J_ret_uppg2(alfa_n, delta , n_eo_theta(theta) , n_o , lambda(i) ) * J_last;
end
E_ut=J_pol2(pi/2)*J_last*E_in;
I_ut(i,j)=norm(E_ut)^2;
end
end
theta_grad=linspace(0,90,points);
plot(theta_grad,I_ut(1,:),'r')
plot(theta_grad,I_ut(2,:),'g')
plot(theta_grad,I_ut(3,:),'b')
And the functions:
function matris=J_proj(alfa)
matris(1,1)=cos(alfa);
matris(1,2)=sin(alfa);
matris(2,1)=-sin(alfa);
matris(2,2)=cos(alfa);
end
function matris=J_pol2(alfa)
J_p0=[1 0;0 0];
matris=J_proj(-alfa)*J_p0*J_proj(alfa);
end
function matris=J_ret_uppg2(alfa_n,delta,n_eo_theta,n_o,lambda)
k0=2*pi/lambda;
J_r0_u2(1,1)=exp(1i*k0*delta*n_eo_theta);
J_r0_u2(2,2)=exp(1i*k0*n_o*delta);
matris=J_proj(-alfa_n)*J_r0_u2*J_proj(alfa_n);
end
Typically you cannot get rid of a for-loop if you are doing a calculation that depends on a previous answer, which seems to be the case with the J_last-variable.
However I saw at least one possible improvement with the n_eo_theta inline-function, instead of doing that calculation 100 times, you could instead simply change this line:
n_eo_theta=#(theta)n_eo*n_o/sqrt(n_o^2*cos(theta)^2+n_eo^2*sin(theta)^2);
into:
theta_0 = 1:100;
n_eo_theta=n_eo*n_o./sqrt(n_o^2*cos(theta_0).^2+n_eo^2*sin(theta_0).^2);
This would run as is, although you should also want to remove the variable "theta" in the for-loop. I.e. simply change
n_eo_theta(theta)
into
n_eo_theta(j)
The way of using the "." prefix in the calculations is the furthermost tool for getting rid of for-loops (i.e. using element-wise calculations). For instance; see element-wise multiplication.
You can use matrices!!!!
For example, you have the statement:
theta=int(j)
which is inside a nested loop. You can replace it by:
theta = [int(1:points);int(1:points);int(1:points)];
or:
theta = int(repmat((1:points), 3, 1));
Then, you have
alfa_n=alfa * n;
you can replace it by:
alfa_n = alfa .* (0:N_skiv);
And have all the calculation done in a row like fashion. That means, instead looping, you will have the values of a loop in a row. Thus, you perform the calculations at the rows using the MATLAB's functionalities and not looping.

How to nest multiple parfor loops

parfor is a convenient way to distribute independent iterations of intensive computations among several "workers". One meaningful restriction is that parfor-loops cannot be nested, and invariably, that is the answer to similar questions like there and there.
Why parallelization across loop boundaries is so desirable
Consider the following piece of code where iterations take a highly variable amount of time on a machine that allows 4 workers. Both loops iterate over 6 values, clearly hard to share among 4.
for row = 1:6
parfor col = 1:6
somefun(row, col);
end
end
It seems like a good idea to choose the inner loop for parfor because individual calls to somefun are more variable than iterations of the outer loop. But what if the run time for each call to somefun is very similar? What if there are trends in run time and we have three nested loops? These questions come up regularly, and people go to extremes.
Pattern needed for combining loops
Ideally, somefun is run for all pairs of row and col, and workers should get busy irrespectively of which iterand is being varied. The solution should look like
parfor p = allpairs(1:6, 1:6)
somefun(p(1), p(2));
end
Unfortunately, even if I knew which builtin function creates a matrix with all combinations of row and col, MATLAB would complain with an error The range of a parfor statement must be a row vector. Yet, for would not complain and nicely iterate over columns. An easy workaround would be to create that matrix and then index it with parfor:
p = allpairs(1:6, 1:6);
parfor k = 1:size(pairs, 2)
row = p(k, 1);
col = p(k, 2);
somefun(row, col);
end
What is the builtin function in place of allpairs that I am looking for? Is there a convenient idiomatic pattern that someone has come up with?
MrAzzman already pointed out how to linearise nested loops. Here is a general solution to linearise n nested loops.
1) Assuming you have a simple nested loop structure like this:
%dummy function for demonstration purposes
f=#(a,b,c)([a,b,c]);
%three loops
X=cell(4,5,6);
for a=1:size(X,1);
for b=1:size(X,2);
for c=1:size(X,3);
X{a,b,c}=f(a,b,c);
end
end
end
2) Basic linearisation using a for loop:
%linearized conventional loop
X=cell(4,5,6);
iterations=size(X);
for ix=1:prod(iterations)
[a,b,c]=ind2sub(iterations,ix);
X{a,b,c}=f(a,b,c);
end
3) Linearisation using a parfor loop.
%linearized parfor loop
X=cell(4,5,6);
iterations=size(X);
parfor ix=1:prod(iterations)
[a,b,c]=ind2sub(iterations,ix);
X{ix}=f(a,b,c);
end
4) Using the second version with a conventional for loop, the order in which the iterations are executed is altered. If anything relies on this you have to reverse the order of the indices.
%linearized conventional loop
X=cell(4,5,6);
iterations=fliplr(size(X));
for ix=1:prod(iterations)
[c,b,a]=ind2sub(iterations,ix);
X{a,b,c}=f(a,b,c);
end
Reversing the order when using a parfor loop is irrelevant. You can not rely on the order of execution at all. If you think it makes a difference, you can not use parfor.
You should be able to do this with bsxfun. I believe that bsxfun will parallelise code where possible (see here for more information), in which case you should be able to do the following:
bsxfun(#somefun,(1:6)',1:6);
You would probably want to benchmark this though.
Alternatively, you could do something like the following:
function parfor_allpairs(fun, num_rows, num_cols)
parfor i=1:(num_rows*num_cols)
fun(mod(i-1,num_rows)+1,floor(i/num_cols)+1);
end
then call with:
parfor_allpairs(#somefun,6,6);
Based on the answers from #DanielR and #MrAzzaman, I am posting two functions, iterlin and iterget in place of prod and ind2sub that allow iteration over ranges also if those do not start from one. An example for the pattern becomes
rng = [1, 4; 2, 7; 3, 10];
parfor k = iterlin(rng)
[plate, row, col] = iterget(rng, k);
% time-consuming computations here %
end
The script will process the wells in rows 2 to 7 and columns 3 to 10 on plates 1 to 4 without any workers idling while more wells are waiting to be processed. In hope that this helps someone, I deposited iterlin and iterget at the MATLAB File Exchange.

Select all elements except one in a vector

My question is very similar to this one but I can't manage exactly how to apply that answer to my problem.
I am looping through a vector with a variable k and want to select the whole vector except the single value at index k.
Any idea?
for k = 1:length(vector)
newVector = vector( exluding index k); <---- what mask should I use?
% other operations to do with the newVector
end
Another alternative without setdiff() is
vector(1:end ~= k)
vector([1:k-1 k+1:end]) will do. Depending on the other operations, there may be a better way to handle this, though.
For completeness, if you want to remove one element, you do not need to go the vector = vector([1:k-1 k+1:end]) route, you can use vector(k)=[];
Just for fun, here's an interesting way with setdiff:
vector(setdiff(1:end,k))
What's interesting about this, besides the use of setdiff, you ask? Look at the placement of end. MATLAB's end keyword translates to the last index of vector in this context, even as an argument to a function call rather than directly used with paren (vector's () operator). No need to use numel(vector). Put another way,
>> vector=1:10;
>> k=6;
>> vector(setdiff(1:end,k))
ans =
1 2 3 4 5 7 8 9 10
>> setdiff(1:end,k)
Error using setdiff (line 81)
Not enough input arguments.
That is not completely obvious IMO, but it can come in handy in many situations, so I thought I would point this out.
Very easy:
newVector = vector([1:k-1 k+1:end]);
This works even if k is the first or last element.
%create a logic vector of same size:
l=ones(size(vector))==1;
l(k)=false;
vector(l);
Another way you can do this which allows you to exclude multiple indices at once (or a single index... basically it's robust to allow either) is:
newVector = oldVector(~ismember(1:end,k))
Works just like setdiff really, but builds a logical mask instead of a list of explicit indices.

Replace certain elements of matrix with NaN (MATLAB)

I have a vector, A.
A=[3,4,5,2,2,4;2,3,4,5,3,4;2,4,3,2,3,1;1,2,3,2,3,4]
Some of the records in A must be replaced by NaN values, as they are inaccurate.
I have created vector rowid, which records the last value that must be kept after which the existing values must be swapped to NaN.
rowid=[4,5,4,3]
So the vector I wish to create, B, would look as follows:
B=[3,4,5,2,NaN,NaN;2,3,4,5,3,NaN;2,4,3,2,NaN,NaN;1,2,3,NaN,NaN,NaN]
I am at a loss as to how to do this. I have tried to use
A(:,rowid:end)
to start selecting out the data from vector A. I am expecting to be able to use sub2ind or some sort of idx to do this, possibly an if loop, but I don't know where to start and cannot find an appropriate similar question to base my thoughts on!
I would very much appreciate any tips/pointers, many thanks
If you are not yet an expert of matlab, I would stick to simple for-loops for now:
B = A;
for i=1:length(rowid)
B(i, rowid(i)+1:end) = NaN;
end
It is always a sport to write this as a one-liner (see Mohsen's answer), but in many cases an explicit for-loop is much clearer.
A compact one is:
B = A;
B(bsxfun(#lt, rowid.', 1:size(A,2)))=NaN;