Convert matlabpool to parpool in MATLAB

I have code written for MATLAB 2010a that I want to run in MATLAB 2019a. It uses parallelism:
matlabpool open 4 %prepares matlab to run in 4 parallel processors
j1 = batch('parallel1', 'matlabpool', 0);
pause(1)
j2 = batch('parallel2', 'matlabpool', 0);
pause(1)
j3 = batch('parallel3', 'matlabpool', 0);
pause(1)
j4 = batch('parallel4', 'matlabpool', 0);
matlabpool close
But the code doesn't run in this version of MATLAB, because I have to use parpool.
How do I convert this part of the code so that it runs in my new MATLAB version?

The literal translation of your code is to do this:
parpool(4) % Creates a parallel pool with 4 workers
j1 = batch('parallel1', 'Pool', 0) % creates a batch job with no pool
... % etc.
However, I'm curious as to whether this is actually what you want to do. The parpool(4) command launches 4 worker processes to be used by your desktop MATLAB - for when you use parfor, spmd, or parfeval. Each batch command spawns an additional worker process, which cannot access the workers from the parallel pool.

The first step is to check the original documentation. Since the 2010a documentation is no longer online, here is the corresponding 2013a documentation, which still explains matlabpool:
'Matlabpool' — An integer specifying the number of workers to make into a MATLAB pool for the job in addition to the worker running the batch job itself. The script or function uses this pool for execution of statements such as parfor and spmd that are inside the batch code. Because the MATLAB pool requires N workers in addition to the worker running the batch, there must be at least N+1 workers available on the cluster. You do not have to have a MATLAB pool already running to execute batch; and the new pool that batch opens is not related to a MATLAB pool you might already have open. (See Run a Batch Parallel Loop.) The default value is 0, which causes the script or function to run on only the single worker without a MATLAB pool.
In current MATLAB versions this option is replaced by the 'Pool' parameter. 0 is still the default, so you can simply use:
j1 = batch('parallel1');
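For completeness, here is a minimal sketch of how the whole original snippet might look in R2019a. It assumes the four scripts parallel1 to parallel4 from the question are on the path and that the default cluster profile is a local one; the variable names (c, scripts, jobs) are only illustrative.

```matlab
% Sketch: submit the four scripts from the question as independent batch jobs.
c = parcluster();                      % default cluster profile (assumed to be local)
scripts = {'parallel1', 'parallel2', 'parallel3', 'parallel4'};
jobs = cell(size(scripts));
for k = 1:numel(scripts)
    jobs{k} = batch(c, scripts{k});    % 'Pool' defaults to 0: one worker per job
end

% Later: wait for each job, then clean up.
for k = 1:numel(jobs)
    wait(jobs{k});                     % block until this job has finished
    % load(jobs{k});                   % optionally copy the script's variables into this workspace
    delete(jobs{k});                   % remove the job's data from the cluster's storage
end
```

Note that no parpool call is needed here, since the batch jobs do not use the client's pool.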

Related

Parallel execution of COM instances in Matlab

I'm trying to accelerate our test environment by using the Parallel Computing Toolbox from MathWorks. However, I am unable to start several MATLAB instances in parallel (up to now we have run our tests sequentially, each one starting a new MATLAB instance via an ActiveX server).
When I run the following code
ML=ver('Matlab');
ML_Path=matlabroot;
ML_Ver=ML.Version;
parfor i = 1:3
NewMatlab = actxserver(['matlab.application.single.',ML_Ver])
Answer = NewMatlab.Feval('test',1);
NewMatlab.Quit;
NewMatlab.release;
end
the Matlab instances are called sequentially (test is just a very simple script that sums up a few numbers).
However if I start a new Matlab via command line
ML=ver('Matlab');
ML_Path=matlabroot;
ML_Ver=ML.Version;
parfor i = 1:3
dos('matlab -nodesktop -minimize -wait -batch "test"');
end
it works. I see that these two methods are quite different in the handling of starting Matlab, but the first approach would be
If you want each iteration of your test to run in a completely separate MATLAB instance, you could use the batch function, like this:
for i = 1:3
j(i) = batch(@test, nOut, {argsIn...});
end
% Later, collect results
for i = 1:3
wait(j(i)), fetchOutputs(j(i))
end
Or, you could simply use parfor directly
parpool() % If necessary
parfor i = 1:3
out{i} = test(...)
end
(You only need to call parpool if no pool is currently open, and you have your preferences set so that a pool is not automatically created when you hit the parfor).
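To make the batch variant concrete, here is a minimal sketch. It assumes that test is a function on the path that accepts one numeric input and returns one output (as the Feval('test',1) call in the question suggests); nJobs, jobs and results are illustrative names.

```matlab
% Sketch: run test(i) in three separate worker processes via batch.
nJobs = 3;
jobs = cell(1, nJobs);
for i = 1:nJobs
    % function handle, number of outputs, cell array of inputs
    jobs{i} = batch(@test, 1, {i});
end

% Later, collect the results.
results = cell(1, nJobs);
for i = 1:nJobs
    wait(jobs{i});                     % block until job i has finished
    out = fetchOutputs(jobs{i});       % cell array with one entry per output
    results{i} = out{1};
    delete(jobs{i});                   % clean up the job's stored data
end
```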

How many times does a loop run in Batch with a pool of 2?

I have a for loop, that generates a 1 by n array, and then saves that array as a mat file called "Batch_Test_N (For Loop Iteration Number)". If I were to run this in batch with a parallel pool of 2, is each mat file generated 2 times, or is each loop run once? For example, does worker 1 run through the entire for loop, and then worker 2 also runs through the entire for loop, or does worker 1 only do iteration 1,3,...9, and worker 2 does the rest?
for i=1:10
filename=['Batch_Test_',num2str(i)];
Array=ones(1,i);
save(filename,'Array')
end
job=batch('Script_Name','Pool',2)
@David already gave the correct answer in a comment; this answer just expands a little on that.
When you run batch('<script>', 'Pool', N), it's basically as if you ran
parpool(N);
<script>
in MATLAB - in other words, your script gets executed with an open parallel pool of size N. Note therefore that batch(..., 'Pool', N) uses N+1 workers on the cluster.
Therefore, as @David correctly points out, to get parallelism on the cluster your script needs to contain parallel language constructs such as parfor, spmd, or parfeval. This is described in the doc: https://uk.mathworks.com/help/parallel-computing/run-a-batch-job.html#bu62o45
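As a rough sketch, the script from the question could be rewritten with parfor so that the two pool workers share the iterations, each iteration running exactly once. The helper saveArray is a name introduced here for illustration; it is needed because save cannot be called directly in the body of a parfor loop. Local functions inside script files require R2016b or later; on older releases the helper would go in its own file.

```matlab
% Sketch of the script's contents (e.g. Script_Name.m), to be submitted
% with batch('Script_Name', 'Pool', 2).
parfor i = 1:10
    saveArray(i);      % each iteration is executed once, by one of the pool workers
end

function saveArray(i)
% Hypothetical helper: wraps save so that it is not called directly in parfor.
    filename = ['Batch_Test_', num2str(i)];
    Array = ones(1, i);
    save(filename, 'Array');
end
```

With the plain for loop from the question, the whole loop runs once on the worker executing the script, and the two pool workers stay idle.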

Does Matlab produce the same Random numbers every time the parallel toolbox is used

Does starting the parallel tool box cause the random number generators to produce the same random numbers?
I seem to get the same results even when I don't actually use parfor, i.e. I open the matlab pool but use for anyway.
Then when I re-run, the results can be different even if I reopen the matlab pool again?
Baz
MATLAB itself always initializes its random number generators in precisely the same way each time you start it. This is to allow you to reproduce results should you need to. For instance, in R2013b, on both WIN64 and GLNXA64, the very first return from rand() is 0.8147....
Likewise, Parallel Computing Toolbox workers have deterministic random number initialization. So, we see the following (in R2013b using the new parpool syntax)
>> parpool('local', 3); spmd, rand, end
Starting parallel pool (parpool) using the 'local' profile ... connected to 3 workers.
Lab 1:
ans =
0.3246
Lab 2:
ans =
0.2646
Lab 3:
ans =
0.8847
There's more info (including details about gpuArray random numbers) in the doc.
There's also some potentially useful info in this c.s-s.m thread.
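A small sketch of what this means in practice on the client: resetting the generator reproduces the same sequence, while seeding from the clock does not. The variable names are illustrative, and this only touches the client's generator; the same rng calls could presumably be issued on the workers inside an spmd block.

```matlab
rng('default');          % reset the client generator to its start-up state
a = rand(1, 3);
rng('default');          % reset again
b = rand(1, 3);
disp(isequal(a, b));     % the two draws are identical

rng('shuffle');          % seed from the current time instead
c = rand(1, 3);          % in practice this differs from a and b
```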

How do I create a parallel loop?

You'd think this would be simple question, but I can't find the solution. Take the following loop:
A = zeros(1,10000000);
parfor i = 1:length(A)
A(i) = i;
end
This only runs on a single core on my computer, although it's readily parallelisable (or at least it should be). I am using Matlab 2012b, and I've tried looking for documentation on how to create parallel loops but can't find any (the matlab docs just show examples of how to create these loops, not how to actually run them in parallel).
I've tried looking up how to modify parallel computing toolbox settings, but none of them work since they're all for Matlab 2013 (I'm using 2012b). If someone could provide an example of a trivial, parallelisable loop that actually runs in parallel I would be very grateful!
Note: I have checked and the parallel computing toolbox is installed, although I have no way of knowing if it is enabled, or how to enable it, since the documentation doesn't seem to provide an answer to this for my version (I typed preferences into the command prompt but didn't see it there).
EDIT: I got it working by doing this:
matlabpool('open',4);
A = zeros(1,10000000);
parfor i = 1:length(A)
A(i) = i;
end
matlabpool('close');
... but I don't really know why this works, whether I have to close the pool every time, what a pool actually is (I've read the documentation, still don't get it), and how matlabpool differs from parpool...
Like I said in my comment, you need to launch the MATLAB workers:
matlabpool open N
The parpool command replaced the matlabpool command in version R2013b. The command creates a number of local workers (assuming your default cluster is the local profile), which are simply MATLAB.exe processes running without a GUI, that execute parts of parallelized code, like your parfor loop.
It is not necessary to close the pool. In some cases you may wish to keep it open for later reuse (as opening it takes some time). Testing for a zero pool size is a helpful way to decide whether a new matlabpool needs to be opened:
A = zeros(1,10000000);
if matlabpool('size') == 0
matlabpool('open',4) ;
end
parfor i = 1:length(A)
A(i) = i;
end
Since the change from matlabpool to parpool, there is an even easier way to create the pool. Unlike parpool, it doesn't throw an error if the pool already exists. Just call gcp (which stands for "get current pool").
gcp();
A = zeros(1,10000000);
parfor i = 1:length(A)
A(i) = i;
end
It is good practice to always leave the pool open; this just ensures that it's open when you need it.
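If you want control over the pool size on R2013b or later, the "open only if needed" check might look like the sketch below. The pool size of 4 is an arbitrary choice carried over from the question, and it assumes the default profile allows that many workers.

```matlab
pool = gcp('nocreate');      % returns [] if no pool is open; never starts one
if isempty(pool)
    pool = parpool(4);       % open a pool with 4 workers on the default profile
end

A = zeros(1, 1e7);
parfor i = 1:numel(A)
    A(i) = i;
end
fprintf('Ran on a pool of %d workers.\n', pool.NumWorkers);
% delete(pool);              % optional: shut the pool down explicitly
```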

How to run Matlab computations in parallel

I have a Matlab .m script that sets up and trains a neural network ("nn") using Matlab's Neural Network Toolbox. The script launches a GUI that shows training progress etc. Training the nn usually takes a long time.
I'm doing these experiments on computer with 64 processor cores. I want to train several networks at the same time without having to run multiple Matlab sessions.
So I want to:
1. Start training of neural network
2. Modify script that creates network to create different one
3. Start training of modified network
4. Modify script to create yet another network...
5. Repeat steps 1-4 several times
The problem is that when I run the script it blocks the Matlab terminal, so I cannot do anything else until the script executes its last command, and that takes a long time. How can I run all those computations in parallel? I do have the Matlab parallel toolbox.
EDIT: Matlab bug??
Update: This problem seems to happen only on R2012a; it looks like it is fixed in R2012b.
There is a very strange error when I try the command sequence recommended in Edric's answer.
Here is my code:
>> job = batch(c, @nn, 1, {A(:, 1:end -1), A(:, end)});
>> wait(job);
>> r = fetchOutputs(job)
Error using parallel.Job/fetchOutputs (line 677)
An error occurred during execution of Task with ID 1.
Caused by:
Error using nntraintool (line 35)
Java is not available.
Here are the lines 27-37 of nntraintool (part of Matlab's Neural networks toolkit) where error originated:
if ~usejava('swing')
if (nargin == 1) && strcmp(command,'check')
result = false;
result2 = false;
return
else
disp('java used');
error(message('nnet:Java:NotAvailable'));
end
end
So it looks like the problem is that a GUI cannot be used (because Swing is not available) when the job is executed using the batch command. The strange thing is that the nn function does not launch any GUI in its current form. The error is caused by train, which launches a GUI by default, but in nn I have switched that off:
net.trainParam.showWindow = false;
net = train(net, X, y);
More interestingly, if the same nn function is launched normally (>> nn(A(:, 1:end -1), A(:, end));) it never enters the outer if statement of nntraintool on line 27 (I have checked that using the debugger). So with the same function and the same arguments, the expression ~usejava('swing') evaluates to 0 when the command is launched normally but to 1 when it is launched using batch.
What do you think about this? It looks like ugly Matlab or Neural networks toolbox bug :(((
With Parallel Computing Toolbox, you can run up to 12 'local workers' to execute your scripts (to run more than that, you'd need to purchase additional MATLAB Distributed Computing Server licences). Given your workflow, the best thing might be to use the BATCH command to submit a series of non-interactive jobs. Note that you will not be able to see any GUI from the workers. You might do something like this (using R2012a+ syntax):
c = parcluster('local'); % get the 'local' cluster object
job = batch(c, 'myNNscript'); % submit script for execution
% now edit 'myNNscript'
job2 = batch(c, 'myNNscript'); % submit script for execution
...
wait(job); load(job) % get the results
Note that the BATCH command automatically attaches a copy of the script to run to the job, so that you are free to make changes to it after submission.
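Since the original problem was a blocked command window, here is a small sketch of how you might check on a submitted job without blocking. It assumes the 'local' profile and the script name myNNscript used above.

```matlab
c = parcluster('local');           % the 'local' cluster object, as above
job = batch(c, 'myNNscript');      % returns immediately; the client stays usable

disp(job.State);                   % e.g. 'queued', 'running' or 'finished'

% ... keep working, edit the script, submit more jobs ...

wait(job);                         % block only when the results are actually needed
diary(job);                        % show the command-window output the job produced
load(job);                         % copy the script's variables into this workspace
delete(job);                       % remove the job's data once you are done with it
```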