Matlab parfor and input files - matlab

I have an algorithm myAlgo() which uses a parameter par1 in order to analyze a set of data (about 1000 .mat files). The path to the .mat files is some cell array I pass also to myAlgo(). The myAlgo() function contains classes and other functions. For every value of par1 I have to test all 1000 .mat files. So it would be a lot faster if I could use a parallel loop since I have an independent (?) problem.
I use the following code with parfor:
par1 = linspace(1,10,100);
myFiles % cell array with the .mat file location
myResult = zeros(length(par1),1);
parfor k=1:length(par1)
myPar = par1(k);
myResult(k) = myAlgo(myPar, myFiles);
end
% do something with myResult
.
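function theResult = myAlgo(myPar, myFiles)
tempResult = zeros(length(myFiles), 1); % one partial result per file
for ii=1:length(myFiles)
tempResult(ii) = initAlgo(myPar, myFiles{ii}); % {} passes the path itself, not a 1x1 cell
end
theResult = sum(tempResult);
end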
So for every parameter in par1 I do the same thing. Unfortunately the processing time does not decrease, even though the CPU (i5) workload shows that all cores are quite active.
Now my question: Is it possible that parfor does not work in this case because every worker initialized by parfor needs to access the folder with the 1000 .mat files, so the workers cannot do their jobs at the same time? If so, is there a way to handle this?

First of all, check whether you have a license for the Parallel Computing Toolbox (PCT). If you do not have one, parfor will behave just like a normal for loop, WITHOUT any actual parallel processing (for compatibility reasons).
Second, make sure to open a parpool first.
Another problem may be that you are using parallel processing for the outer loop with 100 iterations, but not for the larger inner loop with 1000 iterations. You should rephrase your problem as one big loop that allows parfor to parallelize all 100*1000=100000 tasks, not just the 100 outer iterations. This excellent post explains the problem nicely and offers several solutions.
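One way to rephrase the problem as a single loop is sketched below. It is only an illustration under assumptions: initAlgo here is a hypothetical stand-in that returns one number per file, and the file names are made up. The idea is to number the 100*1000 tasks linearly, recover the (parameter, file) pair with ind2sub, and sum the partial results per parameter afterwards.

```matlab
% Hypothetical stand-in for the real per-file computation (assumption)
initAlgo = @(p, f) p * numel(f);

par1 = linspace(1, 10, 100);
% Made-up file names; in the real code this is the existing cell array
myFiles = arrayfun(@(n) sprintf('file%04d.mat', n), 1:1000, ...
    'UniformOutput', false);

nPar = numel(par1);
nFiles = numel(myFiles);
nTasks = nPar * nFiles;
partial = zeros(nTasks, 1);          % one slot per (parameter, file) pair

parfor t = 1:nTasks
    % Map the linear task index back to parameter index k and file index ii
    [k, ii] = ind2sub([nPar, nFiles], t);
    % feval avoids parfor mistaking the function handle for an indexed variable
    partial(t) = feval(initAlgo, par1(k), myFiles{ii});
end

% Sum over the files to get one result per parameter value
myResult = sum(reshape(partial, nPar, nFiles), 2);
```

Note that par1 and myFiles are broadcast to every worker in this form, which is cheap here because they only hold 100 numbers and 1000 short strings.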

Related

How Should I Parallelize My Genetic Algorithm Fitness Evaluation?

I have a GA code that I developed myself. Since I'm new to coding, my code is not fast. I have a Dual-Core CPU 2.6GHz.
The only line of the code that takes a long time to run is the fitness function. I am not familiar with the GA toolbox and my fitness function is quite complex so I assume even if I knew how to use the GA toolbox, I would have to code the fitness function myself.
The algorithm's structure is as follows:
After generating the initial generation and evaluating its fitness values (which takes long, but does not matter much because it runs only once), the code starts a loop that iterates up to 10000 times. Each iteration produces a new generation whose fitness values need to be calculated. So when a new generation of 50 individuals is generated, the whole generation is fed to the fitness_function. In this function there is a for loop that calculates the fitness value of each of the 50 individuals (so the for loop iterates 50 times). Here is my question: how should I use parfor so that 25 individuals are evaluated by one CPU core and the other 25 by the other core, so that the calculation time is cut almost in half? I already know from here
I have tried changing the for loop in the fitness_function directly to parfor and I have received the following error: "The PARFOR loop cannot run due to the way variable "Z" is used." and "Variable z is indexed in different ways. Potentially causing dependencies between iterations." Variable Z is a 50*3 matrix which stores the fitness values for each of the individuals.
The problem with your assignment into Z is that you have three different assignment statements, and that is not allowed. You need to make the assignment into Z meet the requirements for a "sliced" variable. The easiest way to do this is to make a temporary variable Zrow to store the values for the ith row of Z, and then make a single assignment, like this
parfor i = 1:50
Zrow = zeros(1, 3); % allocate to ensure parfor knows this is a temporary
...
Zrow(1) = TTT;
...
Zrow(2) = sum(FSL,1);
Zrow(3) = 0.5*Zrow(1)+0.5*Zrow(2); % read from Zrow, not Z, so Z is only ever indexed one way
% Finally, make a single sliced assignment into Z
Z(i, :) = Zrow;
end
Also, in general, it's best to have the parfor loop be the outermost one. Also, whether parfor actually gives you any speed-up depends a lot on whether the body of the loop is already being multithreaded by MATLAB's built-in multithreaded capabilities. (If it is, then parfor using only your local machine cannot make things faster because in that case, the multithreaded code is already taking full advantage of your computer's resources).
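For reference, here is a self-contained version of the temporary-row pattern that can be run as is. The population matrix pop and the two fitness terms are hypothetical placeholders, not the asker's actual fitness function; only the structure (fill Zrow, then one sliced assignment into Z) is the point.

```matlab
nInd = 50;
pop = rand(nInd, 4);        % hypothetical population, one individual per row
Z = zeros(nInd, 3);         % fitness values, one row per individual

parfor i = 1:nInd
    Zrow = zeros(1, 3);                 % temporary, local to this iteration
    Zrow(1) = sum(pop(i, :));           % placeholder for the first fitness term
    Zrow(2) = max(pop(i, :));           % placeholder for the second fitness term
    Zrow(3) = 0.5*Zrow(1) + 0.5*Zrow(2);
    Z(i, :) = Zrow;                     % the only place Z is indexed
end
```

Because Z appears exactly once, indexed as Z(i, :), parfor can classify it as a sliced output variable.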

Read large number of .h5 datasets

I'm working with these h5 files that have tens of thousands of datasets that contains vectors of numerical values and all of the same size. My goal is to read the datasets and create one large matrix from these vectors. The datasets are named from "0" to "xxxxx" (some large number) I was able to read them and get the matrix but it takes forever to do so. I was wondering if you can take a look at my code and suggest a way to make it run faster
here is how I do it right now
t =[];
for i = 0:40400 % there are 40401 datasets in this particular file
j = int2str(i);
p = '/mesh/'; % The parent group
s = strcat(p,j); % to create the full path of a dataset e.g. '/mesh/0'
r = h5read('temp.h5',s); % the file name is temp and s has the dataset path
t = [t;r];
end
In this particular case there are 40401 datasets, each holding an 80802x1 vector of numerical values, so eventually I want to create an 80802x40401 matrix. This code takes over a day to finish. I think one of the reasons it is slow is that MATLAB accesses the h5 file in every iteration. I would appreciate any tips on speeding up the code.
When I copied your code into the editor, I got a red tilde under the t with the warning:
The variable t appears to change size on every loop iteration. Consider preallocating for speed.
You should allocate the final memory of t before starting the loop with the function zeros, and then fill one column per iteration (t(:,i+1) = r;) instead of concatenating. Note that t = [t;r] stacks the vectors vertically into one long column rather than building the 80802x40401 matrix you want:
t = zeros(80802,40401);
You should also read this: Programming Patterns: Maximizing Code Performance by Optimizing Memory Access:
Preallocate arrays before accessing them within loops
Store and access data in columns
Avoid creating unnecessary variables
Maybe p = '/mesh/'; is useless inside the loop and can be done outside the loop, since it doesn't change. It could be even better to not have p and directly do s = strcat('/mesh/',j);
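Put together, the preallocated loop could look like the sketch below. To keep it runnable it first writes a tiny demo file; the file name, the 5 datasets, and the vector length of 8 are assumptions for illustration only, standing in for temp.h5 with its 40401 datasets of length 80802.

```matlab
% Build a small demo file with datasets /mesh/0 ... /mesh/4 (assumption)
fname = [tempname '.h5'];
nSets = 5;
vecLen = 8;
for i = 0:nSets-1
    ds = sprintf('/mesh/%d', i);
    h5create(fname, ds, [vecLen 1]);
    h5write(fname, ds, (1:vecLen)' + i);
end

% Preallocate once, then fill one column per dataset
t = zeros(vecLen, nSets);
for i = 0:nSets-1
    % Datasets are numbered from 0, MATLAB columns from 1
    t(:, i+1) = h5read(fname, sprintf('/mesh/%d', i));
end

delete(fname);   % clean up the demo file
```

At the real sizes, the full matrix of doubles needs roughly 80802*40401*8 bytes, about 26 GB, so it is worth checking that it fits in memory before preallocating.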

Parallel Computing in MATLAB using drange

I have a code that goes like this which I want to run using parpool:
result = zeros(J,K);
for k = 1:K
for j = 1:J
build(:,1) = old1(:,j,k);
build(:,2) = old2(:,j,k);
result(j,k) = call_function(build); %Takes a long time to run
end
end
It takes a long time to run this code and I have to run this multiple times for my simulation so I want to run the outermost loop (k = 1:K) in parallel in MATLAB.
From what I have read, I cannot use parfor since each function call uses the same variables old1 and old2. I could use spmd and distribute my matrices old1 and old2, but I read that this creates as many copies of the variables as there are workers, and I do not want this to happen. I could use drange, but I am not sure how exactly it works. I am finding it difficult to actually use what I have been reading in the MATLAB references. Any resources and pointers would be of great help!
Constraints are as follows:
Must not create multiple copies of the variables old1, old2. But I can slice it across workers as each iteration doesn't require other iterations.
Have to distribute for the outermost loop only. For ease of accessing data outside this block of code.
Thank you.
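I think old1 and old2 can be used in parfor. Wrap them as pool constants so that the data is transferred to each worker only once, and read them inside the loop through the Value property (for example old1.Value(:,j,k)). Initialize the constants using: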
old1 = parallel.pool.Constant(old1);
old2 = parallel.pool.Constant(old2);
Have you seen this post?
https://www.mathworks.com/help/distcomp/improve-parfor-performance.html
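As an alternative sketch for the "slice it across workers" constraint: parfor treats an array as a sliced variable when it is indexed directly by the loop variable, so indexing old1(:, :, k) sends each worker only the slabs for its own iterations. The sizes and the call_function stand-in below are hypothetical; the real function and data would replace them.

```matlab
N = 4; J = 3; K = 5;                  % hypothetical sizes
old1 = rand(N, J, K);
old2 = rand(N, J, K);
call_function = @(b) sum(b(:));       % hypothetical stand-in for the slow function

result = zeros(J, K);
parfor k = 1:K
    slab1 = old1(:, :, k);            % sliced input: only slab k is needed here
    slab2 = old2(:, :, k);
    col = zeros(J, 1);                % temporary column of results for this k
    for j = 1:J
        build = [slab1(:, j), slab2(:, j)];
        col(j) = call_function(build);
    end
    result(:, k) = col;               % single sliced assignment into result
end
```

Only the outermost loop is parallelized, and result is available as an ordinary J-by-K matrix after the loop.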

How to load .mat files in the folder for parfor in MATLAB

I want to run a parfor loop in MATLAB with following code
B=load('dataB.mat'); % B is a 1600*100 matrix stored as 'dataB.mat' in the local folder
simN=100;
cof=cell(1,simN);
se=cell(1,simN);
parfor s=1:simN
[estimates, SE]=fct(0.5,[0.1,0.8,10]',B(:,s));
cof{s}=estimates';
se{s}=SE';
end
However, the code does not seem to work. There are no warnings; it just runs forever without any output. When I terminated the loop, I found that it had never entered the function 'fct'. Any help would be appreciated on how to load external data like 'dataB.mat' for parallel computing in MATLAB.
If I type this on my console:
rand(1600,100)
and then I save my current workspace as dataB.mat, this command:
B = load('dataB.mat');
will bring me a 1 by 1 struct containing ans field as a 1600x100 double matrix. So, since in each loop of your application you must extract a column of B before calling the function fct (the extracted column becomes the third argument of your call and it must be defined before passing it)... I'm wondering if you didn't check your B variable composition with a breakpoint before proceeding with the parfor loop.
Also, keep in mind that the first time you execute a parfor loop in a brand new MATLAB instance, the MATLAB engine must start all the workers, and this may take a long time. Be patient and, if necessary, run a second test to see whether the problem persists once you are certain the workers have been started.
If those aren't the causes of your issue, I suggest you run a standard loop (for instead of parfor) and set a breakpoint on the first line of your iteration. This should help you spot the problem very quickly.
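The struct issue can be illustrated with a small runnable sketch. It assumes the matrix was saved under the variable name B (the field name after load always matches the saved variable name), writes a temporary demo file instead of using dataB.mat, and uses a hypothetical fct that just returns the mean and standard deviation of the column.

```matlab
% Create a demo .mat file containing a variable named B (assumption)
matFile = [tempname '.mat'];
tmp.B = rand(1600, 100);
save(matFile, '-struct', 'tmp');      % stores the field B as variable B

S = load(matFile);                    % S is a 1x1 struct, not a matrix
B = S.B;                              % extract the matrix before the loop

% Hypothetical stand-in for fct with two outputs
fct = @(a, b, col) deal(mean(col), std(col));

simN = size(B, 2);
cof = cell(1, simN);
se = cell(1, simN);
parfor s = 1:simN
    % feval avoids parfor mistaking the function handle for an indexed variable
    [estimates, SE] = feval(fct, 0.5, [0.1, 0.8, 10]', B(:, s));
    cof{s} = estimates';
    se{s} = SE';
end

delete(matFile);                      % clean up the demo file
```

Running fieldnames(S) right after load is a quick way to see which variable names the file actually contains.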

Simple parallel execution in MATLAB

I have figured out some awesome ways of speeding up my MATLAB code: vectorizing, arrayfun, and basically just getting rid of for loops (not using parfor). I want to take it to the next step.
Suppose I have 2 function calls that are computationally intensive.
x = fun(a);
y = fun(b);
They are completely independent, and I want to run them in parallel rather than serially. I don't have the Parallel Computing Toolbox. Any help is appreciated.
thanks
If I am optimistic I think you ask "How Can I simply do parallel processing in Matlab". In that case the answer would be:
Parallel processing can most easily be done with the parallel computing toolbox. This gives you access to things like parfor.
I guess you can do:
inputs = {a, b};
out = cell(1, 2); % sliced output, so the results survive the loop
parfor t = 1:2
out{t} = fun(inputs{t});
end
x = out{1}; y = out{2};
Of course there are other ways, but that should be the simplest.
The MATLAB interpreter is single-threaded, so the only way to achieve parallelism across MATLAB functions is to run multiple instances of MATLAB. Parallel Computing Toolbox does this for you, and gives you a convenient interface in the form of PARFOR/SPMD/PARFEVAL etc. You can run multiple MATLAB instances manually, but you'll probably need to do a fair bit of work to organise the work that you want to be done.
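For the two-call case from the question, a parfeval version might look like the sketch below. It assumes the Parallel Computing Toolbox is available (as far as I know, recent releases fall back to serial execution without it), and fun, a, and b are hypothetical placeholders for the real function and inputs.

```matlab
fun = @(v) sum(v.^2);        % hypothetical stand-in for the expensive function
a = 1:10;
b = 11:20;

% Queue both calls; each returns a future immediately
fx = parfeval(fun, 1, a);    % 1 = number of outputs requested
fy = parfeval(fun, 1, b);

% fetchOutputs blocks until the corresponding call has finished
x = fetchOutputs(fx);
y = fetchOutputs(fy);
```

Since both futures are created before either result is fetched, the two evaluations can overlap when at least two workers are free.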
The usual examples involve parfor, which is probably the easiest way to get parallelism out of MATLAB's Parallel Computing Toolbox (PCT). The parfeval function is quite easy, as demonstrated in this other post. A less frequently discussed functionality of the PCT is the system of jobs and tasks, which are probably the most appropriate solution for your simple case of two completely independent function calls. Spoiler: the batch command can help to simplify creation of simple jobs (see bottom of this post).
Unfortunately, it is not as straightforward to implement; for the sake of completeness, here's an example:
% Build a cluster from the default profile
c = parcluster();
% Create an independent job object
j = createJob(c);
% Use cells to pass inputs to the tasks
taskdataA = {field1varA,...};
taskdataB = {field1varB,...};
% Create two tasks (one per input cell), each with 2 outputs
nTaskOutputs = 2;
t = createTask(j, @myCoarseFunction, nTaskOutputs, {taskdataA, taskdataB});
% Start the job and wait for it to finish the tasks
submit(j); wait(j);
% Get the outputs from each task
taskoutput = get(t,'OutputArguments');
delete(j); % do not forget to remove the job or your APPDATA folder will fill up!
% Get the outputs
out1A = taskoutput{1}{1};
out1B = taskoutput{2}{1};
out2A = taskoutput{1}{2};
out2B = taskoutput{2}{2};
The key here is the function myCoarseFunction given to createTask as the function to evaluate in the task objects it creates. This can be your fun, or a wrapper if you have complicated inputs/outputs that might require a struct container.
Note that for a single task, the entire workflow above of creating a job and task, then starting them with submit can be simplified with batch as follows:
c = parcluster();
jobA = batch(c, @myCoarseFunction, 1, taskdataA,...
'Pool', c.NumWorkers / 2 - 1, 'CaptureDiary', true);
Also, keep in mind that, as with matlabpool (now called parpool), using parcluster requires time to start up the MATLAB.exe processes that will run your job.