I am using parfor for parallel computing in Matlab. I am not familiar with this command. If that is possible, please look at my code below and tell me if I can write it with parfor. The error :
The parfor loop cannot be run due to the way variable pyra is used.
parfor i = 1:inter
scaled = resize(im, 1/sc^(i-1));
pyra.feat{i} = descripteurs(scaled,class);
pyra.scale(i) = 1/sc^(i-1);
for j = i+inter:inter:max_scale
scaled = reduce(scaled);
pyra.feat{j} = descripteurs(scaled,class);
pyra.scale(j) = 0.6 * pyra.scale(j-inter);
end
end
The issue is that your code is not parallelizable since each iteration through the parfor loop depends on other iterations of the loop.
Specifically, you're trying to access values of pyra.scale from within the inner loop that were computed on previous iterations through the outer loop. Because of this, the execution of each iteration of the parfor loop is dependent upon the previous iteration and therefore the two iterations cannot be executed in parallel (at the same time).
See more about the use of variables in parfor loops in the documentation.
Related
I am trying to use a for loop inside of a parfor loop in Matlab.
The for loop is equivalent to the ballode example in here.
Inside the for loop a function ballBouncing is called which is a system of 6 differential equations.
So, what I am trying to do is to use 500 different sets of parameter values for the ODE system and run it, but for each parameter set, a sudden impulse is added, which is handled through the code in 'for' loop.
However, I don't understand how to implement this using a parfor and a for loop as below.
I could run this code by using two for loops but when the outer loop is made to be a parfor it gives the errors,
the PARFOR loop cannot run due to the way variable results is used,
the PARFOR loop cannot run due to the way variable y0 is used and
Valid indices for results are restricted in PARFOR loops
results=NaN(500,100);
x=rand(500,10);
parfor j=1:500
bouncingTimes=[10,50];%at time 10 a sudden impulse is added
refine=2;
tout=0;
yout=y0;%initial conditions of ODE system
paras=x(j,:);%parameter values for the ODE
for i=1:2
tfinal=bouncingTimes(i);
[t,y]=ode45(#(t,y)ballBouncing(t,y,paras),tstart:1:tfinal,y0,options);
nt=length(t);
tout=[tout;t(2:nt)];
yout=[yout;y(2:nt,:)];
y0(1:5)=y(nt,1:5);%updating initial conditions with the impulse
y0(6)=y(nt,6)+paras(j,10);
options = odeset(options,'InitialStep',t(nt)-t(nt-refine),...
'MaxStep',t(nt)-t(1));
tstart =t(nt);
end
numRows=length(yout(:,1));
results(1:numRows,j)=yout(:,1);
end
results;
Can someone help me to implement this using a parfor outer loop.
Fixing the assignment into results is relatively straightforward - what you need to do is ensure you always assign a whole column. Here's how I would do that:
% We will always need the full size of results in dimension 1
numRows = size(results, 1);
parfor j = ...
yout = ...; % variable size
yout(end:numRows, :) = NaN; % Expand if necessary
results(:, j) = yout(1:numRows, 1); % Shrink 'yout' if necessary
end
However, y0 is harder to deal with - the iterations of your parfor loop are not order-independent because of the way you're passing information from one iteration to the next. parfor can only handle loops where the iterations are order-independent.
I have a problem with MathWorks Parallel Computing Toolbox in Matlab. See my code below
for k=1:length(Xab)
n1=length(Z)*(k-1)+1:length(Z)*k;
MX_j(1,n1)=JXab{k};
MY_j(1,n1)=JYab{k};
MZ_j(1,n1)=Z;
end
for k=length(Xab)+1:length(Xab)+length(Xbc)
n2=length(Z)*(k-1)+1:length(Z)*k;
MX_j(1,n2)=JXbc{k-length(Xab)};
MY_j(1,n2)=JYbc{k-length(Yab)};
MZ_j(1,n2)=Z;
end
for k=length(Xab)+length(Xbc)+1:length(Xab)+length(Xbc)+length(Xcd)
n3=length(Z)*(k-1)+1:length(Z)*k;
MX_j(1,n3)=JXcd{k-length(Xab)-length(Xbc)};
MY_j(1,n3)=JYcd{k-length(Yab)-length(Ybc)};
MZ_j(1,n3)=Z;
end
for k=length(Xab)+length(Xbc)+length(Xcd)+1:length(Xab)+length(Xbc)+length(Xcd)+length(Xda)
n4=length(Z)*(k-1)+1:length(Z)*k;
MX_j(1,n4)=JXda{k-length(Xab)-length(Xbc)-length(Xcd)};
MY_j(1,n4)=JYda{k-length(Yab)-length(Ybc)-length(Ycd)};
MZ_j(1,n4)=Z;
end
If I change the for-loop to parfor-loop, matlab warns me that MX_j is not an efficient variable. I have no idea how to solve this and how to make these for loops compute in parallel?
For me, it looks like you can combine it to one loop. Create combined cell arrays.
JX = cat(2,JXab, JXbc, JXcd, JXda);
JY = cat(2,JYab, JYbc, JYcd, JYda);
Check for the right dimension here. If your JXcc arrays are column arrays, use cat(1,....
After doing that, one single loop should do it:
n = length(Xab)+length(Xbc)+length(Xcd)+length(Xda);
for k=1:n
k2 = length(Z)*(k-1)+1:length(Z)*k;
MX_j(1,k2)=JX{k};
MY_j(1,k2)=JY{k};
MZ_j(1,k2)=Z;
end
Before parallizing anything, check if this still valid. I haven't tested it. If everything's nice, you can switch to parfor.
When using parfor, the arrays must be preallocated. The following code could work (untested due to lack of test-data):
n = length(Xab)+length(Xbc)+length(Xcd)+length(Xda);
MX_j = zeros(1,n*length(Z));
MY_j = MX_j;
MZ_j = MX_j;
parfor k=1:n
k2 = length(Z)*(k-1)+1:length(Z)*k;
MX_j(1,k2)=JX{k};
MY_j(1,k2)=JY{k};
MZ_j(1,k2)=Z;
end
Note: As far as I can see, the parfor loop will be much slower here. You simply assign some values... no calculation at all. The setup of the worker pool will take 99.9% of the total execution time.
The original code is like this:
for i = 1 : size(H, 1)
for j = 1 : size(H, 2)
H{i,j} blabla
and I tried to adapt it into parallel code like this:
parfor ind = 1 : numel(H)
[i, j] = ind2sub(ind);
H{i,j} blabla
which generates an error saying parfor cannot run due to H{i,j}.
Then what's the error here? And how can I adapt the nested loop into parfor?
One possible solution is
for i = 1 : size(H, 1)
parfor j = 1 : size(H, 2)
H{i,j} blabla
But I doubt using a parfor within another loop will multiply the overhead of parfor which results in additional computation time.
I think the error for using parfor is that Matlab is unable to detect that [i,j] is unique through the loop because it is the result of a function. Thus, for the engine, you may access to H{i,j} multiple times, iterations are not analyzed to be independent from each other.
Edit: as mentioned by patrik, you have to be sure that there is no dependence between two iterations, that is here H{i,j} does not depend on H{k,l}, i!=k and j!=l, nor the value of a variable in the iteration is used in another iteration. This requirement is the basic one to allow a parfor, except from reduction assignment.
Besides that point, if you want to run independent computations in parallel, and if it worth it, always choose to parfor the outermost loop. In addition to this, remind that Matlab does not allow nested parfor; instead, you have to make a function which runs a parfor if you want to parallelize inner for-loops. The parallelization of inner loops may not bring a speed-up (depends on how many workers are there in the parpool).
From my experience, it is not recommended to run parallel inner loops. As an example (outside Matlab), I would cite LibSVM, which recommends to parallelize only the outermost loop with openmp if you want to speed-up the computation, never other inner loops.
The reason of this recommendation is that you have a limited pool of workers, and workers may be viewed as threads; there is a limit where if you add threads, the computation run slower because of the time of switching between threads. Matlab may manage this part very well, but the point is that you will have a pool of workers limited in size. If each outermost iteration takes a lot of time and if you have many iteration, you will gain no time to parallelize inner loops because each worker will be busy to run the whole iteration (including inner loops).
Nevertheless, it's always a good thing to test each option, some of them may be counter-intuitively more adapted to your problem!
Why not simply use the linear index to assign into H? For example:
H = cell(4, 4);
parfor idx = 1:16
[i, j] = ind2sub([4, 4], idx);
H{idx} = rand(i, j); % or whatever
end
Otherwise, it's always best to make the outermost loop the PARFOR loop. The following also works:
H = cell(4, 4);
parfor r = 1:4
for c = 1:4
H{r, c} = rand(r, c);
end
end
I want to parallelize block2 for each block1 and parallerlize outer loop too.
previous code:
for i=rangei
<block1>
for j=rangej
<block2> dependent on <block1>
end
end
changed code:
parfor i=rangei
<block1>
parfor j=rangej
<block2> dependent on <block1>
end
end
how much efficient can this get and will the changed code do the right thing?
Is the changed code valid for my requirements?
In MATLAB, parfor cannot be nested. Which means, in your code, you should replace one parfor by a for (the outer loop most likely). More generally, I advise you to look at this tutorial on parfor.
parfor cannot be nested. In nested parfor statements, only the outermost call to parfor is paralellized, which means that the inner call to parfor only adds unnecessary overhead.
To get high efficiency with parfor, the number of iterations should be much higher than the number of workers (or an exact multiple in case each iteration takes the same time), and you want a single iteration to take more than just a few milliseconds to avoid feeling the overhead from paralellization.
parfor i=rangei
<block1>
for j=rangej
<block2> dependent on <block1>
end
end
may actually fit that description, depending on the size of rangei. Alternatively, you may want to try unrolling the nested loop into a single loop, where you iterate across linear indices.
The following code uses a single parfor loop to implicitly manage two nested loops. The loop1_index and loop2_index are the ranges, and the loop1_counter and loop2_counter are the actual loop iterators. Also, the iterators are put in reverse order in order to have a better load balance, because usually the load of higher range values is bigger than those of smaller values.
loop1_index = [1:5]
loop2_index = [1:4]
parfor temp_label_parfor = 1 : numel(loop1_index) * numel(loop2_index)
[loop1_counter, loop2_counter] = ind2sub([numel(loop1_index), numel(loop2_index)], temp_label_parfor)
loop1_counter = numel(loop1_index) - loop1_counter + 1;
loop2_counter = numel(loop2_index) - loop2_counter + 1;
end
You can't use nested parfor, From your question it seems that you are working on a matrix( with parameter i,j),
try using blockproc, go through this link once blockproc
Why can't I use the parfor in this piece of code?
parfor i=1:r
for j=1:N/r
xr(j + (N/r) * (i-1)) = x(i + r * (j-1));
end
end
This is the error:
Error: The variable xr in a parfor cannot be classified.
See Parallel for Loops in MATLAB, "Overview".
The issue here is that of improper indexing of the sliced array. parfor loops are run asynchronously, meaning the order in which each iteration is executed is random. From the documentation:
MATLAB workers evaluate iterations in no particular order, and independently of each other. Because each iteration is independent, there is no guarantee that the iterations are synchronized in any way, nor is there any need for this.
You can easily verify the above statement by typing the following in the command line:
parfor i=1:100
i
end
You'll see that the ordering is arbitrary. Hence if you split a parallel job between different workers, one worker has no way of telling if a different iteration has finished or not. Hence, your variable indexing cannot depend on past/future values of the iterator.
Let me demonstrate this with a simple example. Consider the Fibonacci series 1,1,2,3,5,8,.... You can generate the first 10 terms of the series easily (in a naïve for loop) as:
f=zeros(1,10);
f(1:2)=1;
for i=3:10
f(i)=f(i-1)+f(i-2);
end
Now let's do the same with a parfor loop.
f=zeros(1,10);
f(1:2)=1;
parfor i=3:10
f(i)=f(i-1)+f(i-2);
end
??? Error: The variable f in a parfor cannot be classified.
See Parallel for Loops in MATLAB, "Overview"
But why does this give an error?
I've shown that iterations are executed in an arbitrary order. So let's say that a worker gets the loop index i=7 and the expression f(i)=f(i-1)+f(i-2);. It is now supposed to execute the expression and return the results to the master node. Now has iteration i=6 finished? Is the value stored in f(6) reliable? What about f(5)? Do you see what I'm getting at? Supposing f(5) and f(6) are not done, then you'll incorrectly calculate that the 7th term in the Fibonnaci series is 0!
Since MATLAB has no way of telling if your calculation can be guaranteed to run correctly and reproduce the same result each time, such ambiguous assignments are explicitly disallowed.