How to use parallel 'for' loop in Octave or Scilab? - matlab

I have two for loops running in my MATLAB code. The inner loop is parallelized using matlabpool on 12 processors (the maximum MATLAB allows on a single machine).
I don't have a Distributed Computing license. Please help me do this using Octave or Scilab. I ONLY want to parallelize the 'for' loop.
The links I found while searching on Google are broken.

parfor is not really implemented in Octave yet. The keyword is accepted, but is a mere synonym of for (http://octave.1599824.n4.nabble.com/Parfor-td4630575.html).
The pararrayfun and parcellfun functions of the parallel package are handy on multicore machines.
They are often a good replacement for a parfor loop.
For examples, see
http://wiki.octave.org/Parallel_package.
To install, issue (just once)
pkg install -forge parallel
And then, once on each session
pkg load parallel
before using the functions
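As a minimal sketch of what replacing a loop with pararrayfun can look like, assuming the parallel package has been installed as described above: the name slow_square is hypothetical and stands in for the loop body, and the serial fallback is only there so the snippet still runs when the package is missing.

```octave
% Hypothetical per-iteration work; stands in for the body of the inner loop.
slow_square = @(x) x .^ 2;

inputs = 1:10;

try
  pkg load parallel                       % errors if the package is not installed
  % nproc() returns the number of available processors.
  res = pararrayfun(nproc(), slow_square, inputs);
catch
  % Serial fallback (an assumption added for portability, not part of the package).
  res = arrayfun(slow_square, inputs);
end

disp(res)
```

Each element of inputs is handed to a separate call of the function, so this only fits loops whose iterations are independent of each other.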

In Scilab you can use parallel_run:
function a=g(arg1)
a=arg1*arg1
endfunction
res=parallel_run(1:10, g);
Limitations
uses only one core on Windows platforms.
For now, parallel_run only handles arguments and results of scalar matrices of real values and the types argument is not used
one should not rely on side effects such as modifying variables from outer scope : only the data stored into the result variables will be copied back into the calling environment.
macros called by parallel_run are not allowed to use the JVM
no stack resizing (via gstacksize() or via stacksize()) should take place during a call to parallel_run

In GNU Octave you can use the parfor construct (note that Octave accepts the syntax but currently runs the loop serially, like a plain for):
parfor i=1:10
# do stuff that may run in parallel
endparfor
For more info: help parfor

To see a list of Free and Open Source alternatives to MATLAB-SIMULINK please check its Alternativeto page or my answer here. Specifically for SIMULINK alternatives see this post.
Something you should consider is the difference between vectorized, parallel, concurrent, asynchronous and multithreaded computing. Without going into much detail, vectorized programming is a way to avoid ugly for-loops. For example, the map function and list comprehensions in Python are vectorized computation. It is about the way you write the code, not necessarily how it is handled by the computer. Parallel computation, mostly used for GPU computing (data parallelism), is when you run a massive amount of arithmetic on big arrays using the GPU's computational units. There is also task parallelism, which mostly refers to running a task on multiple threads, each processed by a separate CPU core. Concurrent or asynchronous computing is when you have just one computational unit, but it handles multiple jobs at the same time without blocking the processor unconditionally. Basically like a mom cooking, cleaning and taking care of her kid at the same time, but doing only one job at a time :)
Given the above description, there are a lot of options in the FOSS world for each of these. For Scilab specifically check this page. There is an MPI interface for distributed computation (multithreading/parallelism on multiple computers), OpenCL interfaces for GPU/data-parallel computation, and an OpenMP interface for multithreading/task-parallelism. The feval function is not parallelism but a way to vectorize a conventional function. Scilab matrix arithmetic and parallel_run are vectorized or parallel depending on the platform, hardware and version of Scilab.

Related

parallel code by parfor is slower than serial version by for

I have written some code in MATLAB using the Parallel Computing Toolbox.
More about my code:
I'm trying to implement a parallel genetic algorithm in MATLAB with the Parallel Computing Toolbox.
I have implemented it, but I have a problem: my parallel code with parfor is much slower than the serial version with for.
My code:
tic
for j=1:maxIteration
    parfor i=1:numIslands
        if migrationInterval
            doMigration;
        end
        doCrossover;
        doMutation;
        newSpring;
    end
end
toc
numIslands is always a small number (5 to 12)
maxIteration is always a large number (1500 to 5000)
Please help me.
Thank you.
I recommend running your function with the "Run and Time" tool. The results will show whether the bottleneck is the parfor machinery or your own function.
It may be that parfor is unnecessary here and gives no advantage, but that always depends on the function you run.
You mention that your CPU has two cores. One problem might be the code itself: it looks as if you are calling scripts instead of functions, so you might be flooding your workspace unnecessarily. Furthermore, if any of those scripts declares variables on the fly, you might be clogging your RAM (MATLAB is particularly good at that), thus making your code run slower.
Try to optimize each of the scripts first.
I would really recommend using functions instead of scripts, though.
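One way to check whether parfor pays off is to time the same small loop both ways. The following is only a sketch under assumptions: work is a hypothetical stand-in for one island's step, and the sizes are made up. With only a handful of cheap iterations per parfor call, the cost of dispatching work to the workers can easily exceed the work itself. (In Octave, parfor runs serially, so both timings will be similar there.)

```octave
numIslands = 8;                         % small, as in the question

% Hypothetical stand-in for one island's update, written as a function
% handle rather than a script so it has explicit inputs and outputs.
work = @(i) sum(sqrt(1:1000) * i);

tic;
serialOut = zeros(1, numIslands);
for i = 1:numIslands
    serialOut(i) = work(i);
end
tSerial = toc;

tic;
parallelOut = zeros(1, numIslands);
parfor i = 1:numIslands
    parallelOut(i) = work(i);
end
tParallel = toc;

fprintf('for: %.4f s, parfor: %.4f s\n', tSerial, tParallel);
```

If the parfor timing is not clearly better for a single pass, repeating it thousands of times inside the outer loop will not help either.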

Running identical Matlab scripts on multiple local threads

I have a quad-core desktop computer
I have the Parallel Computing toolbox in Matlab.
I have a script file that I need to run simultaneously on each core
I'm not sure what the most efficient way to do this is, I know I can create a 'matlabpool' with 4 local workers, but how do I then assign the same script to each one? Or can I use the 'batch' command to run the script on a specific thread, then do that for each one?
Thank you!
You can run a single script using multiple cores using the Parallel Computing toolbox, by using
matlabpool open local 4
and using parfor instead of for loops to execute whatever is in your loop across four threads. I'm not sure whether the Parallel Computing Toolbox supports running the entire script individually on each core; this will likely not be supported by your hardware.
Not sure if this works, but here is something to try:
When parallelizing calculations, they are usually wrapped in something like parfor.
So I would recommend doing the same with your script: make sure that all required inputs and outputs have the necessary dimensions and just call:
parfor ii = 1:4
myscript;
end
Sidenote: Before trying this kind of stuff you may want to check your cpu utilization. If it is already high that means that the inner part of the code uses parallel processing and you should not expect much speedup.
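As far as I understand MATLAB's parfor restrictions, a script cannot be called directly inside the loop body, so the idea above usually needs the script's work wrapped in a function with explicit inputs and outputs. A minimal sketch, where runOnce is a hypothetical stand-in for whatever the script computes:

```octave
% Hypothetical replacement for the script body: takes the run index,
% returns a result instead of writing into the shared workspace.
runOnce = @(k) sum((1:100) * k);

nRuns = 4;                       % one run per core on a quad-core machine
results = zeros(1, nRuns);       % preallocate the output, one slot per run

parfor ii = 1:nRuns
    results(ii) = runOnce(ii);   % each iteration writes only its own slot
end

disp(results)
```

Each iteration is independent and stores its output in its own element of results, which is the pattern parfor expects.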

Understanding MATLAB on multiple cores, multiple processors and MPI

I have several closely related questions about how MATLAB takes advantage of parallel hardware. They are short, so I thought it would be best to put them in the same post:
Does MATLAB leverage/benefit from multiple cores when not using the Parallel Computing Toolbox?
Does MATLAB leverage/benefit from multiple processors when not using the PCT?
Does MATLAB use MPI when not using the PCT?
Does MATLAB use MPI when using the PCT?
Does MATLAB leverage/benefit from multiple cores when not using the Parallel Computing Toolbox?
Yes. Since R2007a, more and more built-in functions have been re-written to be multi-threaded (though multi-threading will only kick in if it's beneficial).
Element Wise Functions and Expressions:
------------------------------------------------------------------------------------------------
Functions that speed up for double arrays > 20k elements
1) Trigonometric: ACOS(x), ACOSH(x), ASIN(x), ASINH(x), ATAN(x), ATAND(x), ATANH(x), COS(x), COSH(x), SIN(x), SINH(x), TAN(x), TANH(x)
2) Exponential: EXP(x), POW2(x), SQRT(x)
3) Operators: x.^y
For Example: 3*x.^3+2*x.^2+4*x +6, sqrt(tan(x).*sin(x).*3+8);
Functions that speed up for double arrays > 200k elements
4) Trigonometric: HYPOT(x,y), TAND(x)
5) Complex: ABS(x)
6) Rounding and remainder: UNWRAP(x), CEIL(x), FIX(x), FLOOR(x), MOD(x,N), ROUND(x)
7) Basic and array operations: LOGICAL(X), ISINF(X), ISNAN(X), INT8(X), INT16(X), INT32(X)
Linear Algebra Functions:
------------------------------------------------------------------------------------------------
Functions that speed up for double arrays > 40k elements (200 square)
1) Operators: X*Y (Matrix Multiply), X^N (Matrix Power)
2) Reduction Operations: MAX and MIN (Three Input), PROD, SUM
3) Matrix Analysis: DET(X), RCOND(X), HESS(X), EXPM(X)
4) Linear Equations: INV(X), LSCOV(X,x), LINSOLVE(X,Y), A\b (backslash)
5) Matrix Factorizations: LU(X), QR(X) for sparse matrix inputs
6) Other Operations: FFT and IFFT of multiple columns of data, FFTN, IFFTN, SORT, BSXFUN, GAMMA, GAMMALN, ERF,ERFC,ERFCX,ERFINV,ERFCINV, FILTER
For code implemented as .m file, multiple cores won't help, though.
Multi-threaded mex-files will benefit as well, of course.
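To make the distinction concrete, here is a small sketch comparing a whole-array expression (which gives the built-in, possibly multithreaded, element-wise routines a chance to run) with the equivalent element-by-element loop. The array size is an assumption chosen to sit above the 20k-element threshold quoted above; whether multithreading actually kicks in depends on the version and machine.

```octave
x = linspace(0, 1, 50000);       % more than 20k elements

% Whole-array form: one expression over the entire vector.
tic;
yVec = 3*x.^3 + 2*x.^2 + 4*x + 6;
tVec = toc;

% Element-by-element form: same arithmetic, one scalar at a time.
tic;
yLoop = zeros(size(x));
for k = 1:numel(x)
    yLoop(k) = 3*x(k)^3 + 2*x(k)^2 + 4*x(k) + 6;
end
tLoop = toc;

maxDiff = max(abs(yVec - yLoop));
fprintf('vectorized: %.4f s, loop: %.4f s, max difference: %g\n', ...
        tVec, tLoop, maxDiff);
```

Both forms compute the same values; only the first hands the whole array to the built-in operators in one go.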
Does MATLAB use MPI when not using the PCT?
Not to my knowledge.
Does MATLAB use MPI when using the PCT?
Yes, when you run it on a cluster (though you can use other schedulers as well). To do this, you need a license for the MATLAB Distributed Computing Server. I don't know what architecture the local scheduler uses (the one you use when you run parallel jobs on a local machine); the fact that MPI functions are part of the PCT suggests that it may use MPI for at least part of the functionality.
EDIT: See @Edric's answer for more details
To clarify and expand on a couple of points from @Jonas' detailed answer:
PCT uses a build of MPICH2 (this is not shipped with base MATLAB).
MPI functions are available under the local scheduler - in this case, the build of MPICH2 can take advantage of shared memory for communication.
The labSend/labReceive family of functions present a wrapper around MPI_Send/MPI_Recv etc.
When not using the PCT, MATLAB issues only one command at a time (single-threaded).
However, if you have a multi-threaded BLAS, you could still benefit from extra cores (and it doesn't particularly matter whether they're all in a single processor or not).
MEX files can also be written with multiple threads, in which case you will use multiple cores even without the PCT. If you have performance problems, rewriting some of the hotspots as MEX is often a big win.
First, the answers are mostly "No, but...", as @BenVoigt has addressed. The "but..." part comes from libraries used by Matlab. One of the most notable examples was given by Ben, for BLAS, and you can replace this with one that supports multiple cores or processors, such as ATLAS, the Intel or AMD versions, Goto BLAS, or some other options.
You can also call out from Matlab to code in other languages, which can leverage multiple cores, processors, computers, etc. In the past, I have called R from Matlab, and have made use of multiple cores in this way, by taking advantage of R packages that support multicore processing. The same could be done with MPI. However, as you scale, you'll discover that more and more of your code ends up being in the language that can do more parallel or distributed work (i.e. a free language like R, Python, C, C++, or Java), rather than in Matlab.
So, does Matlab benefit from such infrastructure without PCT? Not directly. Can your code in Matlab benefit from such infrastructure via various supporting libraries? Yes.
When not using PCT, MATLAB uses only one core/one processor.
I don't know the answers to the 3rd and 4th questions.

How to utilise parallel processing in Matlab

I am working on a time series based calculation. Each iteration of the calculation is independent. Could anyone share some tips / online primers on utilising parallel processing in Matlab? How can this be specified inside the actual code?
Since you have access to the Parallel toolbox, I suggest that you first check whether you can do it the easy way.
Basically, instead of writing
for i=1:lots
out(:,i)=do(something);
end
You write
parfor i=1:lots
out(:,i)=do(something);
end
Then, you use matlabpool to create a number of workers (you can have a maximum of 8 on your local machine with the toolbox, and tons on a remote cluster if you also have a Distributed Computing Server license), and you run the code, and see nice speed gains when your iterations are run by 8 cores instead of one.
Even though the parfor route is the easiest, it may not work right out of the box, since you might do your indexing wrong, or you may be referencing an array in a problematic way etc. Look at the mlint warnings in the editor, read the documentation, and rely on good old trial and error, and you should figure it out reasonably fast. If you have nested loops, it's often best to parallelize only the innermost one and ensure it does tons of iterations - this is not only good design, it also reduces the amount of code that could give you trouble.
Note that especially if you run the code on a local machine, you may run into memory issues (which might manifest as really slow execution in parallel mode because you're paging): every worker gets a copy of the workspace, so if your calculation involves creating a 500MB array, 8 workers will need a total of 4GB of RAM - and then you haven't even started counting the RAM of the parent process! In addition, it can be good to use only N-1 cores on your machine, so that there is still one core left for other processes that may run on the computer (such as a mandatory antivirus...).
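A minimal sketch of the pattern described in this answer, with made-up sizes and a hypothetical per-step calculation: the output is preallocated, each iteration writes only its own column, and the memory arithmetic from the note above is spelled out at the end.

```octave
nSteps = 200;                       % number of independent iterations (assumed)
nValues = 3;                        % values produced per iteration (assumed)
out = zeros(nValues, nSteps);       % preallocate so each column can be filled independently

parfor i = 1:nSteps
    % Hypothetical stand-in for do(something): depends only on i.
    out(:, i) = [i; i^2; sqrt(i)];
end

% Rough memory estimate from the note above: every worker holds its own copy.
arrayMB = 500;
nWorkers = 8;
totalGB = arrayMB * nWorkers / 1000;    % 500 MB x 8 workers = 4 GB
fprintf('Workers alone need about %g GB\n', totalGB);
```

Indexing out by the loop variable in exactly one position, as in out(:, i), is what lets each iteration own its slice of the result.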
MathWorks offers its own Parallel Computing Toolbox. If you do not want to purchase that, there are a few options:
You could write your own MEX file and use pthreads or OpenMP.
However, make sure you do not call any MEX API functions in the parallel part of the code, because they aren't thread safe.
If you want coarser-grained parallelism via MPI you can try pmatlab.
Same with parmatlab.
Edit: Adding link Parallel MATLAB with openmp mex files
I have only tried the first.
Don't forget that many Matlab functions are already multithreaded. By careful programming you may be able to take advantage of them -- check the documentation for your version as the Mathworks seem to be increasing the range and number of multithreaded functions with each new release. For example, it seems that 2010a has multithreaded ffts which may be useful for time series processing.
If the intrinsic multithreading is not what you need, then as @srean suggests, the Parallel Computing Toolbox is available. For my money (or rather, my employers' money) it's the way to go, allowing you to program in parallel in Matlab, rather than having to bolt things on. I have to admit, too, that I'm quite impressed by the toolbox and the facilities it offers.

MATLAB and using multiple cores to run calculations

Hey all. I'm trying to sort out how to get MATLAB running as well as possible. I have a pretty decent new machine:
12GB RAM
Core i7 3.2GHz CPU
lots of free space
and a strong graphics card.
However, when I run MATLAB's benchmark test (the bench command) it ranks the computer near the worst, around a Windows XP single-core 1.7GHz machine.
Any ideas why, and how I can improve this?
Thanks very much
Firstly, I would recommend re-running the bench command a few times to make sure MATLAB has fully loaded all the libraries etc. it needs. Much of MATLAB is loaded on demand, so it's always best to time the second or third run.
MATLAB automatically takes advantage of multiple cores when executing certain operations which are multithreaded. For example lots of elementwise operations such as +, .* and so on as well as BLAS-backed operations (and probably others). This page lists those things which are multithreaded.
Parallel Computing Toolbox is useful when MATLAB's intrinsic multithreading can't help (if it can, then it's usually the fastest way to do things). This gives you explicit parallelism via PARFOR, SPMD and distributed arrays.
You need the Parallel Computing Toolbox. A lot of MATLAB functions are multithreaded, but to parallelize your own code, you'll need it. A dumb hack is to open several instances of command-line MATLAB. You could also write multithreaded MEX files, but the right way to go about it would be to purchase and use the aforementioned toolbox.
This may be obvious, but make sure that you have enabled multithreaded computation in the preferences (File > Preferences > General > Multithreading). In some versions of MATLAB, it's not enabled by default.