You'd think this would be a simple question, but I can't find the solution. Take the following loop:
A = zeros(1,10000000);
parfor i = 1:length(A)
A(i) = i;
end
This only runs on a single core on my computer, although it's readily parallelisable (or at least it should be). I am using Matlab 2012b, and I've tried looking for documentation on how to create parallel loops but can't find any (the matlab docs just show examples of how to create these loops, not how to actually run them in parallel).
I've tried looking up how to modify parallel computing toolbox settings, but none of them work since they're all for Matlab 2013 (I'm using 2012b). If someone could provide an example of a trivial, parallelisable loop that actually runs in parallel I would be very grateful!
Note: I have checked and the parallel computing toolbox is installed, although I have no way of knowing if it is enabled, or how to enable it, since the documentation doesn't seem to provide an answer to this for my version (I typed preferences into the command prompt but didn't see it there).
EDIT: I got it working by doing this:
matlabpool('open',4);
A = zeros(1,10000000);
parfor i = 1:length(A)
A(i) = i;
end
matlabpool('close');
... but I don't really know why this works, whether I have to close the pool every time, what a pool actually is (I've read the documentation, still don't get it), and how matlabpool differs from parpool...
Like I said in my comment, you need to launch the MATLAB workers:
matlabpool open N
The parpool command replaced the matlabpool command in version R2013b. The command creates a number of local workers (assuming your default cluster is the local profile), which are simply MATLAB.exe processes running without a GUI, that execute parts of parallelized code, like your parfor loop.
It is not necessary to close the pool. In some cases you may wish to keep it open for later reuse (opening it also takes some time). Testing for a pool size of zero is a handy way to decide whether a new matlabpool needs to be opened:
A = zeros(1,10000000);
if matlabpool('size') == 0
matlabpool('open',4) ;
end
parfor i = 1:length(A)
A(i) = i;
end
Since the change from matlabpool to parpool, there is an even easier way to create the pool. Unlike parpool, it doesn't throw an error if the pool already exists. Just call gcp (which stands for "get current pool").
gcp();
A = zeros(1,10000000);
parfor i = 1:length(A)
A(i) = i;
end
It is good practice to always leave the pool open; this just ensures that it's open when you need it.
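If you want to check for a pool without starting one, something along these lines should work on R2013b or later. It is only a sketch: it assumes the Parallel Computing Toolbox is installed and that the default profile is the local one, and the array size is arbitrary.

```matlab
% Look for an existing pool; 'nocreate' means this call never starts one.
pool = gcp('nocreate');
if isempty(pool)
    pool = parpool();   % open a pool with the default profile and size
end
fprintf('Pool has %d workers\n', pool.NumWorkers);

A = zeros(1, 1000);
parfor i = 1:numel(A)
    A(i) = i;
end

% Optional: shut the pool down explicitly once you no longer need it.
delete(gcp('nocreate'));
```

Leaving out the final delete keeps the workers alive for the next parfor, which is usually what you want in an interactive session.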
I have some code from MATLAB 2010a that I want to run in MATLAB 2019a. It uses parallelism:
matlabpool open 4 % prepares MATLAB to run on 4 parallel processors
j1 = batch('parallel1', 'matlabpool', 0);
pause(1)
j2 = batch('parallel2', 'matlabpool', 0);
pause(1)
j3 = batch('parallel3', 'matlabpool', 0);
pause(1)
j4 = batch('parallel4', 'matlabpool', 0);
matlabpool close
But the code doesn't run in this version of MATLAB, because I have to use parpool.
So I'm asking whether someone knows how to convert or change this part of the code so that it runs in my new MATLAB version.
The literal translation of your code is to do this:
parpool(4) % Creates a parallel pool with 4 workers
j1 = batch('parallel1', 'Pool', 0) % creates a batch job with no pool
... % etc.
However, I'm curious as to whether this is actually what you want to do. The parpool(4) command launches 4 worker processes to be used by your desktop MATLAB - for when you use parfor, spmd, or parfeval. Each batch command spawns an additional worker process, which cannot access the workers from the parallel pool.
The first step is to check the original documentation. Since the 2010a documentation is no longer online, here is the corresponding 2013a documentation, which still explains matlabpool:
'Matlabpool' — An integer specifying the number of workers to make into a MATLAB pool for the job in addition to the worker running the batch job itself. The script or function uses this pool for execution of statements such as parfor and spmd that are inside the batch code. Because the MATLAB pool requires N workers in addition to the worker running the batch, there must be at least N+1 workers available on the cluster. You do not have to have a MATLAB pool already running to execute batch; and the new pool that batch opens is not related to a MATLAB pool you might already have open. (See Run a Batch Parallel Loop.) The default value is 0, which causes the script or function to run on only the single worker without a MATLAB pool.
In current MATLAB versions this option is replaced by the 'Pool' parameter. 0 is still the default, so you can simply use:
j1 = batch('parallel1');
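Put together, the four jobs from the question could be submitted and collected roughly as follows. This is a sketch only: it assumes the scripts parallel1.m to parallel4.m from the question are on the MATLAB path and that the default cluster profile has at least four workers available.

```matlab
% Script names are taken from the question; they must exist on the path.
scripts = {'parallel1', 'parallel2', 'parallel3', 'parallel4'};
jobs = cell(size(scripts));

for k = 1:numel(scripts)
    % 'Pool' defaults to 0, so each job runs on a single worker.
    jobs{k} = batch(scripts{k});
end

for k = 1:numel(jobs)
    wait(jobs{k});      % block until this job has finished
    diary(jobs{k});     % print whatever the script wrote to the command window
    delete(jobs{k});    % remove the job's stored data from the cluster
end
```

If the scripts leave results in their workspace, load(jobs{k}) before the delete copies those variables into the client workspace.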
TL;DR: How should custom simulation runs be managed in Matlab? Detailed Questions at the end.
I am working with MATLAB, where I created some code to check the influence of various parameters on a simulated system. It has a lot of inputs and outputs, but an MWE would be:
function [number2x,next_letter] = test(number, letter)
number2x = number * 2;
next_letter = letter + 1;
disp(['next letter is ' next_letter])
disp(['number times 2 is ' num2str(number2x)])
end
This works if this is all there is to test. However, over time multiple new inputs and outputs had to be added. Also, because of the growing number of parameters that have been tested, some sort of log had to be created:
xlswrite('testfile.xlsx',[num2str(number), letter,num2str(number2x),next_letter],'append');
Also, because the calculation takes a few hours and should run overnight, multiple parameter sets had to be started at once. This is easily done with [x1,y1] = test(1,'a');[x2,y2] = test(2,'b'); in one line, or by adding new tasks while the old ones are still running. However, this way you can't keep track of how many are still open.
So in total I need some sort of testing framework that can keep up with changing inputs and outputs, keeps track of calculations that are already done, and ideally also handles the open runs.
I feel like I can't be the only one who faces this issue. In fact, I think so many people face it that MathWorks would already have come up with a solution.
For Simulink this has been done in the form of the Simulation Manager, but for MATLAB functions the closest thing I found is the testing framework (example below), which seems to be meant for software development and debugging rather than for what I am trying to do. At some point there seem to have been third-party solutions, but they were discontinued in favor of this testing framework.
function solutions = sampleTest
solutions = functiontests({@paramtertest});
end
function paramtertest(vargin)
test(1,'a');
test(2,'b');
end
function [number2x,next_letter] = test(number, letter)
number2x = number * 2;
next_letter = letter + 1;
disp(['next letter is ' next_letter])
disp(['number times 2 is ' num2str(number2x)])
xlswrite('testfile.xlsx',[num2str(number), letter,num2str(number2x),next_letter],'append');
end
Alternatively I could create my test as a class, create an interface similar to the Simulation Manager, create numerous functions for managing inputs and outputs and visualizing the progress, and then spawn multiple instances of it if I want to set up a new set of parameters while a simulation is already running. Possible, yet a lot of work that does not involve the simulation directly.
In total following questions arise:
Is there a built-in MATLAB function for managing simulations that I have totally missed so far?
Can the testing framework be used for this purpose?
Is there already some framework (not from MathWorks) that can handle this?
If I create my own class, could multiple instances run individually and keep track of their own progress? And would those be handled simultaneously, or would MATLAB end up running them in the order they were started?
I know this question is somewhat in the off-topic: recommend or find a tool, library or favorite off-site resource area. If you feel it is too much so, please focus on the last question.
Thank you!
I've done similar tests using GUI elements. The basic part of the simulation was inside a while loop, for example in your case:
iter = 0;
iter_max = 5; % the number of times you will call the script
accu_step = 2; %the accuracy of stored data
Alphabet = 'abcdefghijklmnopqrstuvwxyz';
while iter < iter_max
iter = iter+1;
[x1,y1] = test(iter,Alphabet(iter));
end
Now you should create a handle to a progress bar inside your computation script. It will show you both which step you are on and the progress of the current step.
global h;
global iter_opt;
if isempty(h)
h=waitbar(0,'Solving...');
else
waitbar(t/t_end,h,sprintf('Solving... current step is:%d',iter));
end
You didn't specify which function you use. If it is, for example, time-dependent, the t/t_end expression above is an estimate of the current progress.
The storing of results also needs to change on every execution of the loop, for example:
global iter;
i_line = (t_end/accu_step+2)*(iter-1);
xlswrite('results.xlsx',{'ITERATION ', iter},sheet,strcat('A',num2str(i_line+5)))
xlswrite('results.xlsx',results_matrix(1:6),sheet,strcat('D',num2str(i_line+5)))
The example above also assumes that your results are time-related, so you store data every 2 units of time (days, hours, minutes, whatever you need) from t_0 to t_end, with 2 additional rows of separation between steps. The number of columns is just an example; you can adjust it to your needs.
After the calculation is done, you can close waitbar with:
global h
close(h)
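If the Parallel Computing Toolbox is available, another way to keep track of open runs is to queue each parameter set with parfeval and collect the results as they finish. A minimal sketch, assuming R2013b or later; the anonymous function simulate is only a hypothetical stand-in for the real test function, and the parameter values are made up:

```matlab
% Hypothetical stand-in for the real simulation:
% doubles the number and advances the letter by one.
simulate = @(number, letter) deal(number * 2, char(letter + 1));

numbers = [1 2 3];
letters = 'abc';
nRuns = numel(numbers);

pool = gcp();   % start or reuse a pool (needs Parallel Computing Toolbox)

% Queue every parameter set; parfeval returns immediately with a future.
futures(1:nRuns) = parallel.FevalFuture;
for k = 1:nRuns
    futures(k) = parfeval(pool, simulate, 2, numbers(k), letters(k));
end

% Collect results in completion order and report progress.
doubled = zeros(1, nRuns);
nextLetters = blanks(nRuns);
for done = 1:nRuns
    [k, d, c] = fetchNext(futures);   % blocks until some run has finished
    doubled(k) = d;
    nextLetters(k) = c;
    fprintf('Run %d finished (%d of %d done)\n', k, done, nRuns);
end
```

Each future also has a State property ('queued', 'running', 'finished'), so you can check at any time how many runs are still open, and new runs can be queued while earlier ones are still executing.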
While using MATLAB's parfor I came across the following behaviour:
parpool(2)
parfor j=1:100
v = j+1;
clear v
end
> Error in ==> parallel_function>make_general_channel/channel_general at 886
> Transparency violation error.
I looked into it, and indeed one is not allowed to use clear within parfor.
My question is why. v is created inside every specific worker, and so it does not interfere with other workers.
MATLAB uses a static code analyzer to understand how the body of a parfor loop interacts with the main workspace, i.e. which variables need to be transferred to the workers and back. A number of functions, such as eval, evalc, evalin, assignin (with the workspace argument specified as 'caller'), load (unless the output is assigned to a variable), save and clear, can modify the workspace in ways that the static analyzer cannot predict. When such functions are used, there is no way to ensure the integrity of the workspace while multiple workers are operating on it.
An important thing to realize is that when you use command syntax to invoke a function, such as clear v, the argument is passed as a string literal. The static analyzer therefore cannot tell which variable you are trying to clear, and so cannot figure out what effect the command will have on the workspace.
As suggested in documentation, the workaround to free up most of the memory used by a variable inside parfor is: v = [];
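As a rough illustration of that workaround (the matrix size and the output variable total are made up for the example):

```matlab
total = zeros(1, 100);
parfor j = 1:100
    v = rand(200);          % temporary variable, private to each iteration
    total(j) = sum(v(:));   % sliced output, sent back to the client
    v = [];                 % frees the matrix without calling clear
end
```

Since v is a temporary variable, it is not sent back to the client after the loop anyway; the assignment only matters if the memory has to be released before the iteration ends.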
I have figured out some awesome ways of speeding up my MATLAB code: vectorizing, arrayfun, and basically just getting rid of for loops (not using parfor). I want to take it to the next step.
Suppose I have 2 function calls that are computationally intensive.
x = fun(a);
y = fun(b);
They are completely independent, and I want to run them in parallel rather than serially. I don't have the parallel processing toolbox. Any help is appreciated.
thanks
If I am optimistic I think you ask "How Can I simply do parallel processing in Matlab". In that case the answer would be:
Parallel processing can most easily be done with the parallel computing toolbox. This gives you access to things like parfor.
I guess you can do:
inputs = {a, b};
results = cell(1, 2);
parfor t = 1:2
results{t} = fun(inputs{t}); % sliced output, so it survives the loop
end
x = results{1};
y = results{2};
Of course there are other ways, but that should be the simplest.
The MATLAB interpreter is single-threaded, so the only way to achieve parallelism across MATLAB functions is to run multiple instances of MATLAB. Parallel Computing Toolbox does this for you, and gives you a convenient interface in the form of PARFOR/SPMD/PARFEVAL etc. You can run multiple MATLAB instances manually, but you'll probably need to do a fair bit of work to organise the work that you want to be done.
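For the two independent calls in the question, the PARFEVAL route could look roughly like this. It is a sketch that assumes the toolbox is installed (R2013b or later); fun, a and b below are hypothetical stand-ins for the real function and inputs.

```matlab
% Hypothetical stand-ins for the expensive function and its inputs.
fun = @(n) sum(svd(rand(n)));
a = 200;
b = 300;

pool = gcp();                     % start or reuse a pool of workers

% Both calls are queued immediately and run on separate workers.
fx = parfeval(pool, fun, 1, a);
fy = parfeval(pool, fun, 1, b);

% fetchOutputs blocks until the corresponding call has finished.
x = fetchOutputs(fx);
y = fetchOutputs(fy);
```

The client session stays free between the parfeval calls and fetchOutputs, so other work can be done while the two calls run.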
The usual examples involve parfor, which is probably the easiest way to get parallelism out of MATLAB's Parallel Computing Toolbox (PCT). The parfeval function is quite easy, as demonstrated in this other post. A less frequently discussed functionality of the PCT is the system of jobs and tasks, which are probably the most appropriate solution for your simple case of two completely independent function calls. Spoiler: the batch command can help to simplify creation of simple jobs (see bottom of this post).
Unfortunately, it is not as straightforward to implement; for the sake of completeness, here's an example:
% Build a cluster from the default profile
c = parcluster();
% Create an independent job object
j = createJob(c);
% Use cells to pass inputs to the tasks
taskdataA = {field1varA,...};
taskdataB = {field1varB,...};
% Create one task per input cell, each with 2 outputs
nTaskOutputs = 2;
t = createTask(j, @myCoarseFunction, nTaskOutputs, {taskdataA, taskdataB});
% Start the job and wait for it to finish the tasks
submit(j); wait(j);
% Get the outputs from each task
taskoutput = get(t,'OutputArguments');
delete(j); % do not forget to remove the job or your APPDATA folder will fill up!
% Get the outputs
out1A = taskoutput{1}{1};
out1B = taskoutput{2}{1};
out2A = taskoutput{1}{2};
out2B = taskoutput{2}{2};
The key here is the function myCoarseFunction given to createTask as the function to evaluate in the task objects it creates. This can be your fun, or a wrapper if you have complicated inputs/outputs that might require a struct container.
Note that for a single task, the entire workflow above of creating a job and task, then starting them with submit can be simplified with batch as follows:
c = parcluster();
jobA = batch(c, @myCoarseFunction, 1, taskdataA,...
'Pool', c.NumWorkers / 2 - 1, 'CaptureDiary', true);
Also, keep in mind that, as with matlabpool (now called parpool), a job submitted through parcluster needs time to start up the MATLAB.exe processes that will run it.
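To get results back from a batch job you wait for it and then call fetchOutputs. A small self-contained sketch, using the built-in max in place of myCoarseFunction and assuming the default cluster profile has a free worker:

```matlab
c = parcluster();                      % cluster object from the default profile

% Run [m, idx] = max([3 7 5]) on a single worker, requesting 2 outputs.
job = batch(c, @max, 2, {[3 7 5]});

wait(job);                             % block until the job has finished
out = fetchOutputs(job);               % cell array: {maximum, index}
delete(job);                           % clean up the stored job data

largest = out{1};
position = out{2};
```

The inputs go in as one cell array, just like taskdataA above, and the outputs come back as a cell array with one entry per requested output.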
I am trying to do some computations and I would like to do them in parallel, using parfor or by opening the matlabpool, as the current implementation is too slow:
result=zeros(25,16000);
for i = 1:length(vector1) % length is 25
for j = 1:length(vector2) % length is 16000
temp1 = vector1(i);
temp2 = vector2(j);
t1 = load(matfiles1(temp1).name) %load image1 from matfile1
t2 = load(matfiles2(temp2).name) % load image2 from matfile2
result(i,j)=t1.*t2
end
end
It works fine, but I would really like to know if there is a way to speed things up...
Thanks a lot in advance!
Using a parfor loop and opening a matlabpool go together. Opening the matlabpool provides your MATLAB session with dedicated workers with which it can run the body of your parfor loop. So, you could change your code to something like this:
matlabpool open local 4 % or however many cores you have
parfor i = ...
...
end
Before running your code in parallel, I would definitely recommend using the MATLAB profiler to ensure you understand where the time is being spent running your code. (I'm a little surprised that hoisting the load into t1 into the outer loop has no effect - the profiler presumably should therefore show that the load calls take very little time compared to the rest of your algorithm).
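A profiling run could be set up roughly like this. It is only a sketch: the temporary MAT-file and the variable img are made up to stand in for the real image files, and how much detail the report shows for built-in functions such as load may depend on the MATLAB version.

```matlab
% Hypothetical test data: one image stored in a temporary MAT-file.
matFile = [tempname '.mat'];
img = rand(500);
save(matFile, 'img');

profile on                      % start collecting timing data
for k = 1:20
    s = load(matFile);          % the load step from the question
    product = s.img .* s.img;   % the element-wise multiply
end
profile off

% Print the five most expensive entries recorded by the profiler, if any.
stats = profile('info');
[~, order] = sort([stats.FunctionTable.TotalTime], 'descend');
for k = order(1:min(5, numel(order)))
    fprintf('%-40s %.3f s\n', stats.FunctionTable(k).FunctionName, ...
        stats.FunctionTable(k).TotalTime);
end

delete(matFile);                % remove the temporary file
```

In an interactive session, profile viewer opens the same data as a browsable report with per-line timings.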