Running identical Matlab scripts on multiple local threads - matlab

I have a quad-core desktop computer
I have the Parallel Computing toolbox in Matlab.
I have a script file that I need to run simultaneously on each core
I'm not sure what the most efficient way to do this is, I know I can create a 'matlabpool' with 4 local workers, but how do I then assign the same script to each one? Or can I use the 'batch' command to run the script on a specific thread, then do that for each one?
Thank you!

You can run a single script using multiple cores using the Parallel Computing toolbox, by using
matlabpool open local 4
and using parfor instead of for loops to execute whatever is in your loop across four threads. I'm not sure if Parallel Computing toolbox supports running the entirety of the script individually on each core, this will likely not be supported by your hardware.

Not sure if this works, but here is something to try:
When trying to paralelize calculations, they are usually wrapped with something like parfor
So I would recommend doing the same with your script, make sure that all required inputs and outputs have the neccesary dimensions and just call:
parfor ii = 1:4
myscript;
end
Sidenote: Before trying this kind of stuff you may want to check your cpu utilization. If it is already high that means that the inner part of the code uses parallel processing and you should not expect much speedup.

Related

Matlab library code on cluster - not using more than one core/thread?

I am running matlab on a university cluster. The code has no parfor loops but has makes extensive use of vectorized code. So on my local machine when I run the code, the code actually often uses several threads.
However, on the cluster, even though I allocate 76 cores to the program, it never uses more than 1.
I am not sure if there is any specific instruction I need to add to the beginning of the code or to the sbatch command.
Any ideas?
You can use maxNumCompThreads to control the number of computational threads MATLAB will use.

How to specify the number of cores from the linux command line?

I have a MATLAB script using parallel for loops. I want to run my script on a Linux server but I don't know how can I run it from the linux shell without displaying the MATLAB GUI. Also, how can I specify number of cores to use?
matlab -nodesktop
Use maxNumCompThreads to set the total number of threads / cores for MATLAB to use.
If you require MATLAB to run on a single thread, use matlab -singleCompThread. However, I'm not sure why you'd want to control the total number of cores. By default, MATLAB makes use of the multithreading capabilities of the machine it's running on.
As an additional sidenote, maxNumCompThreads will be removed in future releases of MATLAB, so don't rely on this behaviour if you desire longetivity.

Spawn multiple copies of matlab on the same machine

I am facing a huge problem. I built a complex C application with embedded Matlab functions that I call using the Matlab engine (engOpen() and such ...). The following happens:
I spawn multiple instances of this application on a machine, one for each core
However! ... The application then slows down to a halt. In fact, on my 16-core machine, the application slows down approximately by factor 16.
Now I realized this is because there is only a sngle matlab engine started per machine and all my 16 instances share the same copy of matlab!
I tried to replicate this with the matlab GUI and its the same problem. I run a program in the GUI that takes 14 seconds, and THEN I run it in two GUIs at the same time and it takes 28 seconds
This is a huge problem for me, because I will miss my deadline if I have to reprogram my entire c application without matlab. I know that matlab has commands for parallel programming, but my matlab calls are embedded in the C application and I want to run multiple instances of the C application. Again, I cannot refactor my entire c application because I will miss the deadline.
Can anyone please let me know if there is a solution for this (e.g. really start multiple matlab processes on the same machine). I am willing to pay for extra licenses. I currently have fully lincensed matlab installed on all machines.
Thank you so so much!
EDIT
Thank you Ben Voigt for your help. I found that a single instance of Matlab is already using multiple cores. In fact, running one instance shows me full utilization of 4 cores. If I run two copies of Matlab, I get full utilization of 8 cores. Hence it is actually running in parallel. However, even though 2 instances seem to take up double the processing power, I still get 2* slowdown. Hence, 2 instances seem to get twice the result with 4* the compute power total. Why could that be?
Your slowdown is not caused by stuffing all N instances into a single MatLab instance on a single core, but by the fact that there are no longer 16 cores at the disposal of each instance. Many MATLAB vector operations use parallel computation even without explicit parallel constructs, so more than one core per instance is needed for optimal efficiency.
MATLAB libraries are not thread-safe. If you create multithreaded applications, make sure only one thread accesses the engine application.
I think the matlab engine is the wrong technique. For windows platforms, you can try using the com automation server, which has the .Single option which starts one matlab instance for each com client you open.
Alternatives are:
Generate C++ code for the functions.
Create a .NET library. (NE Builder)
Run matlab via command line.

Determine the Maximum Number of Processors Available for matlabpool (MATLAB Parallel Toolbox)

I'm currently writing some code in MATLAB that uses the parfor loop to speed up some tedious calculations.
My issue is that the code will be run on a remote cluster, and could be run on 4-core, 8-core or 12-core machines (I won't know which one in advance)...
I basically need a code snippet that will allow MATLAB to determine the maximum number of cores that can be used in matlabpool. If we call this variable maxcores, I can then go ahead and use
matlabpool('open',maxcores).
so that I can make sure that I am using all the cores that are available to me.
You can get the number of cores on the machine through feature('numCores'), which is undocumented but seems unlikely to break. (source)
Someone claims there that getNumberOfComputationalThreads also works since R2007a, but it doesn't on my R2012a.
Beyond Dougal's response, I found getenv('NUMBER_OF_PROCESSORS') returns the number of threads on my Windows systems.

How to utilise parallel processing in Matlab

I am working on a time series based calculation. Each iteration of the calculation is independent. Could anyone share some tips / online primers on using utilising parallel processing in Matlab? How can this be specified inside the actual code?
Since you have access to the Parallel toolbox, I suggest that you first check whether you can do it the easy way.
Basically, instead of writing
for i=1:lots
out(:,i)=do(something);
end
You write
parfor i=1:lots
out(:,i)=do(something);
end
Then, you use matlabpool to create a number of workers (you can have a maximum of 8 on your local machine with the toolbox, and tons on a remote cluster if you also have a Distributed Computing Server license), and you run the code, and see nice speed gains when your iterations are run by 8 cores instead of one.
Even though the parfor route is the easiest, it may not work right out of the box, since you might do your indexing wrong, or you may be referencing an array in a problematic way etc. Look at the mlint warnings in the editor, read the documentation, and rely on good old trial and error, and you should figure it out reasonably fast. If you have nested loops, it's often best parallelize only the innermost one and ensure it does tons of iterations - this is not only good design, it also reduces the amount of code that could give you trouble.
Note that especially if you run the code on a local machine, you may run into memory issues (which might manifest in really slow execution in parallel mode because you're paging): Every worker gets a copy of the workspace, so if your calculation involves creating a 500MB array, 8 workers will need a total 4GB of RAM - and then you haven't even started counting the RAM of the parent process! In addition, it can be good to only use N-1 cores on your machine, so that there is still one core left for other processes that may run on the computer (such as a mandatory antivirus...).
Mathworks offers its own parallel computing toolbox. If you do not want to purchase that, there a few options
You could write your own mex file and use pthreads or OpenMP.
However make sure you do not call any Mex api in the parallel part of the code, because they arent thread safe
If you want coarser grained parallelism via MPI you can try pmatlab
Same with parmatlab
Edit: Adding link Parallel MATLAB with openmp mex files
I have only tried the first.
Don't forget that many Matlab functions are already multithreaded. By careful programming you may be able to take advantage of them -- check the documentation for your version as the Mathworks seem to be increasing the range and number of multithreaded functions with each new release. For example, it seems that 2010a has multithreaded ffts which may be useful for time series processing.
If the intrinsic multithreading is not what you need, then as #srean suggests, the Parallel Computing Toolbox is available. For my money (or rather, my employers' money) it's the way to go, allowing you to program in parallel in Matlab, rather than having to bolt things on. I have to admit, too, that I'm quite impressed by the toolbox and the facilities it offers.