We're developing an application which is processing medical images of an eye retina.
Very often the straight iterating through the pixel indices is used. And even when the size of images is fixed to 1024*768 pixels it may be a CPU-consuming operation e.g. to assign certain values to binarized pixels we need.
lowlayers2 = zeros(img_y_size, img_x_size);
for i=1:numel(lowlayers)
y = rem(lowlayers(i),img_y_size);
x = fix(lowlayers(i)/img_y_size)+1;
lowlayers2(y,x) = 1;
end;
When trying to use parfor in a simple loop above the debugger types that all variables in the loop must be presented as sliced ones. I guess it's in order to divide iterations more primitively inside the loop.
How can I modify the loop or variable to be able to use parfor? May every variable be presented as a sliced variable (mean more multidim matrix with 2 or 3 dimentions)?
a sliced variable is a variable that has a reference out of parfor loop and each of its element only accessed by a single worker (in parfor paralle workers)
sometime matlab doesnt recognize a variable in parfor loop as "sliced variable"
so you could use a temporary variable and collect results after the parfor loop,
lowlayers2 = zeros(img_y_size, img_x_size);
parfor i=1:numel(lowlayers)
y = rem(lowlayers(i),img_y_size);
x = fix(lowlayers(i)/img_y_size)+1;
t(i)= sub2ind(size(lowlayers2),y,x);
end
lowlayers2(t)=1;
NOTE 1: It is better to vectorise code in the older versions because loops didn't use to be as good as they are now in R2017 referring to (this).
Related
Just beginning with parallel stuff...
I have a code that boils down to filling the columns of a matrix A (which is preallocated with NaNs) with an array of variable length:
A = nan(100);
for ii=1:100
hmy = randi([1,100]); %lenght of the array
A(1:hmy,ii) = rand(hmy,1); %array
end
Simply transforming the for in a parfor does not even run
parfor ii=1:100
hmy = randi([1,100]); %how many not NaN values to put in
A(1:hmy,ii) = rand(hmy,1);
end
because the parfor does not like the indexing:
MATLAB runs loops in parfor functions by dividing the loop iterations
into groups, and then sending them to MATLAB workers where they run in
parallel. For MATLAB to do this in a repeatable, reliable manner, it
must be able to classify all the variables used in the loop. The code
uses the indicated variable in a way that is incompatible with
classification.
I thought that this was due to the indexing on the first dimension and tried a workaround that did not work (same error message as before):
parfor ii=1:100
hmy = randi([1,100]);
tmp = [rand(hmy,1); NaN(size(A,1)-hmy,1)];
A(:,ii) = tmp;
end
How can I index A in order to store the array?
You cant partially change the row or column data in A. You have to do either a full row or full column inside parfor. Here is the updated code.
A = nan(100);
parfor ii=1:100
hmy = randi([1,100]); %lenght of the array
temp = nan(1,100);
temp(1:hmy) = rand(hmy,1); %array
A(:,ii) = temp; %updating full row of iith column
end
First of all, the output variable (A in this case) will be a sliced variable in case of parfor. Which means this variable will be sliced in different part for parallel calculation. The form of indexing in the sliced variable should be same for all incident. Here, you are creating a random number(hmy) and using (1:hmy) as a index which is changing every now and then. That's why your output variable can't be sliced and you have the error.
If you try with a fixed hmy, then it will be alright.
I have a question about parallelizing code in MATLAB. I use MATLAB 2017a.
Let's say I have a cell array:
A = { A1, ..., A10}
and these matrices are quite big ( size > 10000 ). Now I want to start manipulating these matrices in a parallelpool. In fact, ther first worker needs only A1, the second worker needs only A2 and so on.
I have now this code;
parfor i = 1:10
matrix = A{i};
blabla = manipulate(Ai);
save(blabla);
end
I think that MATLAB gives every worker all the matrices in A but this is not really needed. Is there a way to say:
"Give i-th worker only matrix Ai"?
Based on the documentation for variables in parfor loops, specifically sliced variables, it appears as though the cell array A in your example meets the criteria to be treated as a sliced variable by default. You shouldn't have to do anything special. You may want to confirm that all the listed criteria are met, and take a look at each variable to see how they are used both inside and outside your parfor loop.
You want spmd blocks. That way, you explicitly handle the slicing of the parallel data, rather than letting Matlab do it automagically with the parfor block.
parpool('myprofile',10)
spmd
i = labindex;
B = foo(A{i});
end
for i = 1:10
bar(B{i});
end
I would like to run a Simulink model in a parfor loop on many cores with different data. However, I could not get the sim results when I use parfor whereas I can obtain them while using only for loop.
It simply get [t,u] from workspace1, consider a transfer function n{1}/d{1} and then calculates the EqFracInt to workspace2.
The problematic part of my code is
...
parfor ieq=1:1
assignin('base','t',t);
assignin('base','u',u);
assignin('base','n',n);
assignin('base','d',d);
assignin('base','T_end',T_end);
[simout] = sim('RespSpecFrac', [0 T_end], simset('ReturnWorkspaceOutputs','on'));
PGRs = simout.get('EqFracInt');
end
I could not get the PGRs values. Could you please explain the error to me and how to solve it?
The cause you can't get PGRs values is that is a temporary variable. Since in a parfor loop each iteration is independent, any variable assigned directly inside the loop without indexing is used only in that specific iteration and destroyed.
In order to get the desired values, you need to transform PGRs in either a sliced variable (by indexing it) or in a reduction variable (by, for example, concatenating it). Try to modify the last line of your loop following one of these examples:
PGRs(ieq, :) = simout.get('EqFracInt'); % sliced variable
or
PGRs = [PGRs; simout.get('EqFracInt')]; % reduction variable
The specific implementation will depend on the shape of the expected outputs EqFracInt, of course. You can see more about types of variables inside a parfor loop in the documentation.
I am implementing an Sum of Square Distances based disparity Map function in Matlab for computer vision. Currently the code has a nested for loop and runs very slow. Any suggestions on vectorizing it to make it more efficient? Thanks
%im1 and im2 are images and win1, win2 are window sizes
for i=win1+1:1:bottom-win1
parfor j=win2+1:1:right-win2
%j=[win2+1:bottom-win2];
template=im1(i-win1:i+win1,j-win2:j+win2);
arg1=conv2(im2.^2,ones(size(template))/2,'same');
arg2=conv2(im2,rot90(template,2),'same');
arg=arg1-arg2;
[xj]=find(arg==min(arg(:)));
disparityMap(i,j)=1-xj(1);
end
end
Three suggestions to try to speed things up:
move the parfor to the outer loop to reduce the overhead of the parallel construct ;
compute im2.^2 once before the loop and save its value in a temporary variable as it does not depend on the loop variables there is no need to compute it again and again, and actually
move the whole computation of arg1 out of the loops as it only depends on the size of template and not its value, and if I see correctly, the size is constant ;
replace the [xj]=find(arg==min(arg(:))); construct with something along the lines of [tmp, ind] = min(arg(:)) ; xj=ind2sub(size(arg), ind) to avoid the call to find and rescan the matrix while the indices can be computed simply.
Untested, but it should give you a start
arg1=conv2(im2.^2,ones([2*win1+1, 2*win2+1])/2,'same');
parfor i=win1+1:1:bottom-win1
for j=win2+1:1:right-win2
%j=[win2+1:bottom-win2];
template=im1(i-win1:i+win1,j-win2:j+win2);
arg2=conv2(im2,rot90(template,2),'same');
arg=arg1-arg2;
[tmp, ind] = min(arg(:)) ;
xj=ind2sub(size(arg), ind);
disparityMap(i,j)=1-xj(1);
end
end
Also make sure the number of workers is chosen appropriately, and try to compile the code to mex to see if there is improvement.
I am dealing with a very huge matrix and so wanted to use parallel computing in MATLAB to run in clusters. Here I have created a sparse matrix using:
Ad = sparse(length(con)*length(uni_core), length(con)*length(uni_core));
I have a written function adj using which I can fill the matrix Ad.
Every time the loop runs, from the function adj I get a square symmetric matrix which is to be assigned to the Ad from 3682*(i-1)+1 to 3682 *(i-1)+3682 in the first index and similarly in the second index. This is shown here:
parfor i = 1:length(con)
Ad((3682*(i-1))+1:((3682*(i-1))+3682), ...
(3682*(i-1))+1:((3682*(i-1))+3682)) = adj(a, b, uni_core);
end
In a normal for loop it is running without any problem. But in parfor in parallel computing I am getting an error that there is a problem in using the sliced arrays with parfor.
Outputs from PARFOR loops must either be reduction variables (e.g. calculating a summation) or "sliced". See this page in the doc for more.
In your case, you're trying to form a "sliced" output, but your indexing expression is too complicated for PARFOR. In a PARFOR, a sliced output must be indexed by: the loop variable for one subscript, and by some constant expression for the other subscripts. The constant expression must be either :, end or a literal scalar. The following example shows several sliced outputs:
x3 = zeros(4, 10, 3);
parfor ii = 1:10
x1(ii) = rand;
x2(ii,:) = rand(1,10);
x3(:,ii,end) = rand(4,1);
x4{ii} = rand(ii);
end
In your case, your indexing expression into Ad is too complicated for PARFOR to handle. Probably the simplest thing you can do is return the calculations as a cell array, and then inject them into Ad on the host side using a regular FOR loop, like so:
parfor i = 1:length(con)
tmpout{i} = ....;
end
for i = 1:length(con)
Ad(...) = tmpout{i};
end
Edric has already explained why you're getting an error, but I wanted to make another suggestion for a solution. The matrix Ad you are creating is made up of a series of 3682-by-3682 blocks along the main diagonal, with zeroes everywhere else. One solution is to first create your blocks in a PARFOR loop, storing them in a cell array. Then you can combine them all into one matrix with a call to the function BLKDIAG:
cellArray = cell(1,length(con)); %# Preallocate the cell array
parfor i = 1:length(con)
cellArray{i} = sparse(adj(a,b,uni_core)); %# Compute matrices in parallel
end
Ad = blkdiag(cellArray{:});
The resulting matrix Ad will be sparse because each block was converted to a sparse matrix before being placed in the cell array.