Parallelize MATLAB for loop to calculate MLE - matlab

I am attempting to speed up my MATLAB code by using parfor, however, I am doing it incorrectly. My code is rather simple, I am fitting some data using MATLAB's built-in mle function by using varying initial guesses for the mean (mm) and variance (vv). onestagepdf2 is my probability density function.
Here is the code snippet:
mm=linspace(.1, 1, 2); % mean
vv=linspace(.1, 2, 2); % variance
N=length(mm);
n=length(vv);
pd=zeros(n*N,2);
ld = NaN*ones(n*N,1);
options = statset('MaxIter',10000, 'MaxFunEvals',10000);
parfor i=1:N % pick a mean
    m=mm(i);
    parfor j=1:n % pick a variance
        v=vv(j);
        x0=[m,v];
        [p,conf1]=mle(data,'pdf',@onestagepdf2,'start',x0, 'upperbound', [Inf Inf],'lowerbound',[0 0],'options',options)
        pd(n*(i-1)+j,:)=p; % store parameter values from mle
        l=onestagepdf2(data,p(1),p(2)); % evaluate pdf with parameter values
        ld(n*(i-1)+j)=sum(log(l)); % store likelihood value
    end
end
The error that I receive is:
'The variable pd in a parfor cannot be classified.'

pd = zeros(N, n, 2); % initialise culprits
ld = zeros(N, n);
parfor ii=1:N % pick a mean
    m=mm(ii);
    for jj=1:n % inner loop stays an ordinary for
        v=vv(jj);
        x0=[m,v];
        [p,conf1]=mle(data,'pdf',@onestagepdf2,'start',x0, 'upperbound', [Inf Inf],'lowerbound',[0 0],'options',options)
        pd(ii, jj, :) = p; % store parameter values from mle
        l=onestagepdf2(data,p(1),p(2)); % evaluate pdf with parameter values
        ld(ii,jj)=sum(log(l)); % store likelihood value
    end
end
Your pd was indeed the culprit, as @Trilarion stated, for the error you got. ld probably isn't great either, with the same indexing, so initialise that as well. This happens because parfor wants to know the size of all variables within the loop before executing; MATLAB cannot tell that the size has a fixed maximum, because you are using the loop variables to "change" the size.
You probably had "orange wiggles" under those two lines (like spell-check squiggles) when you ran this as a for loop, saying "pd appears to be growing in size each iteration. Please consider preallocating for speed". With parfor this is a hard requirement: since the iterations do not run in sequential order, it is impossible to grow arrays this way.
Secondly, you cannot nest parfor loops. You can use things like a function containing a parfor and call that from within a parfor, but that won't get you a speed-up, since you are already using all your workers.
See Saving time and memory using parfor in Matlab? for more general information on parfor, especially on speed.

You want a sliced output variable, but MATLAB is not clever enough to detect that n*(i-1)+j is actually reasonable and won't interfere with an asynchronous evaluation.
Just use separate dimensions:
pd = zeros(N, n, 2);
...
% in the loop
pd(i, j, :) = p;
That will work.
Please note that MATLAB does not allow nested parfor loops. However, you also do not need them if N is larger than the number of workers. See also the documentation.
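For completeness, here is a minimal sketch that puts both answers together (it assumes data, onestagepdf2 and options are defined as in the question): preallocate pd and ld with the loop indices as separate dimensions, parallelise only the outer loop, and keep the inner loop as a plain for.
mm = linspace(.1, 1, 2); % means
vv = linspace(.1, 2, 2); % variances
N = length(mm);
n = length(vv);
pd = zeros(N, n, 2); % sliced output: fitted parameters for each (i,j)
ld = NaN(N, n);      % sliced output: log-likelihood for each (i,j)
options = statset('MaxIter',10000, 'MaxFunEvals',10000);
parfor i = 1:N % pick a mean
    m = mm(i);
    for j = 1:n % pick a variance (ordinary for, not parfor)
        x0 = [m, vv(j)];
        p = mle(data, 'pdf', @onestagepdf2, 'start', x0, ...
            'upperbound', [Inf Inf], 'lowerbound', [0 0], 'options', options);
        pd(i, j, :) = p;                    % store parameter values from mle
        l = onestagepdf2(data, p(1), p(2)); % evaluate pdf with fitted parameters
        ld(i, j) = sum(log(l));             % store log-likelihood value
    end
end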

Related

Matlab parallel code with many anonymous functions leading to memory errors

I have a code that solves a scientific problem with many different inputs/parameters. I'm using a parallel for loop to iterate through a range of parameters, and running into trouble with memory usage. I've done my best to put together a MWE that represents my code.
Basically, for each parameter combination I run a small loop over several different solver options. In my real code, this changes solver tolerances and the equations used (we have a few different transformations which can help conditioning). Each computation is effectively a shooting method for a small ODE system (3 equations, but each is quite complicated and generally stiff), with an optimisation routine calling the ODE solver. Each run takes seconds to minutes, so the parallelisation overhead is negligible and the speedup scales pretty much exactly with the number of cores.
To explain the code below, start with driver. First I define some parameters (a and f in the MWE) and save them in a file; the filename gets passed around between functions. Then I create the 3 (in this case) sets of solver parameters, which choose the ODE solver, tolerance, and set of equations to use. Then I enter the for loop, looping over another parameter c, and at each iteration use each of the sets of solver parameters to call the optimisation function. Finally, I save a temporary file with the results of each iteration (so I don't lose everything if the server goes down). These files are about 1 kB each, and I will only have around 10,000 of them, so the overall size is on the order of 10 MB. After the main loop I recombine everything back into single vectors.
The equations function creates the actual differential equations to solve; this is done using a switch statement to choose which equations to return. The objectiveFunction function uses str2func to specify the ODE solver, calls equations to get the equations to solve, then solves them and computes an objective function value.
The problem is that there appears to be some sort of memory leak. After some time, on the order of days, the code slows down and finally gives an out-of-memory error (running on 48 cores with ~380 GB of memory available; ode15s is what threw the error). The increase in memory usage over time is fairly gradual, but it is definitely there, and I can't figure out what is causing it.
The MWE with 10,000 values of c takes quite a while to run (1,000 is probably sufficient actually), and the memory usage per worker does increase over time. I think the file loading/saving and job distribution cause quite a lot of overhead, unlike in my actual code, but this doesn't affect memory usage.
My question is: what could be causing this slow increase in memory usage?
My ideas for what is causing the problem are:
1. Using str2func isn't great; should I use a switch instead and accept having to write the solvers into the code explicitly? (See the sketch after this list.)
2. All the anonymous functions getting called all the time (in the ODE solver) are holding on to workspace data and not releasing it at the end of each parfor iteration.
3. Suppressed warnings are causing issues: I suppress lots of ODE step-size warnings (this shouldn't be a factor because the bug that means this causes issues was fixed in 2017a, and the server I use runs 2017b).
4. Something in fminbnd or ode15s is actually leaking memory.
I can't come up with a way to get around 1 and 2 nicely and efficiently (both from a code-performance and code-writing point of view), and I doubt 3 or 4 are actually the problem.
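For reference, here is a minimal sketch of what idea 1 could look like (solveWithSwitch is a hypothetical helper name; this only illustrates the alternative being weighed, not a claim that it fixes the leak). It mirrors the setup struct and the solver call used in objectiveFunction below:
function [t,y] = solveWithSwitch(setup, dxdt, tspan, x0, options)
% pick the ODE solver explicitly instead of via str2func
switch setup.method
    case 'ode45'
        [t,y] = ode45(dxdt, tspan, x0, options);
    case 'ode15s'
        [t,y] = ode15s(dxdt, tspan, x0, options);
    otherwise
        error('Unknown solver: %s', setup.method);
end
end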
Here is the driver function:
function [xi,mfv] = driver()
% a and f are used in all cases. In actual code these are defined in a
% separate function
paramFile = 'params';
a = 4;
f = @(x) 2*x;
% this filename (params) gets passed around from function to function
save('params.mat','a','f')
% The struct setup has specific options for each iteration
setup(1).method = 'ode45'; % any ODE solver can be used here
setup(1).atol = 1e-3; % change the ODE solver tolerance
setup(1).eqs = 'second'; % changes what equations are solved
setup(2).method = 'ode15s';
setup(2).atol = 1e-3;
setup(2).eqs = 'second';
setup(3).method = 'ode15s';
setup(3).atol = 1e-4;
setup(3).eqs = 'first';
c = linspace(0,1);
parfor i = 1:numel(c) % loop over parameter c
    xi = 0;
    minFVal = inf;
    for j = 1:numel(setup) % loop over each solver configuration in setup
        % find optimal initial condition and record corresponding value of
        % objective function
        [xInitial,fval] = fminsearch(@(x0) objectiveFunction(x0,c(i),...
            paramFile,setup(j)),1);
        if fval<minFVal % keep the best solution
            xi = xInitial;
            minFVal = fval;
        end
    end
    % save some variables
    saveInParForLoop(['tempresult_' num2str(i)],xi,minFVal);
end
% Now combine temporary files into single vectors
xi = zeros(size(c)); mfv = xi;
for i = 1:numel(c)
    S = load(['tempresult_' num2str(i) '.mat'],'xi','minFVal');
    xi(i) = S.xi;
    mfv(i) = S.minFVal;
end
% delete the temporary files now that the data has been consolidated
for i = 1:numel(c)
    delete(['tempresult_' num2str(i) '.mat']);
end
end
function saveInParForLoop(filename,xi,minFVal)
% you can't save directly in a parfor loop, this is the workaround
save(filename,'xi','minFVal')
end
Here is the function to define the equations
function [der,transform] = equations(paramFile,setup)
% Defines the differential equation and a transformation for the solution
% used to calculate the objective function
% Note in my actual code I generate these equations earlier
% and pass them around directly, rather than always redefining them
load(paramFile,'a','f')
switch setup.eqs
    case 'first'
        der = @(x) f(x)*2+a;
        transform = @(x) exp(x);
    case 'second'
        der = @(x) f(x)/2-a;
        transform = @(x) sqrt(abs(x));
end
And here is the function to evaluate the objective function:
function val = objectiveFunction(x0,c,paramFile,setup)
load(paramFile,'a')
% specify the ODE solver and AbsTol from setup
solver = str2func(setup.method);
options = odeset('AbsTol',setup.atol);
% get the differential equation and transform equations
[der,transform] = equations(paramFile,setup);
dxdt = @(t,y) der(y);
% solve the IVP
[~,y] = solver(dxdt,0:.05:1,x0,options);
% calculate the objective function value
val = norm(transform(y)-c*a);
If you run this code it will create 100 temporary files and then delete them; it will also create the params file, which won't be deleted. You will need the Parallel Computing Toolbox.
There's just a chance you might be running into this known problem: https://uk.mathworks.com/support/bugreports/1976165 . This is marked as being fixed in R2019b, which has just been released. (The leak caused by this is tiny but persistent - so it might indeed take days to become apparent).
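If upgrading MATLAB is not immediately possible, a general mitigation for slowly growing worker memory (not something the linked bug report prescribes, just a common workaround) is to split the c loop into batches and recycle the pool between them. A minimal sketch reusing the names from driver() above; numBatches and edges are hypothetical:
numBatches = 10;                                   % hypothetical number of batches
edges = round(linspace(0, numel(c), numBatches+1));
for b = 1:numBatches
    parfor i = edges(b)+1:edges(b+1)               % same body as the parfor in driver()
        xi = 0;
        minFVal = inf;
        for j = 1:numel(setup)
            [xInitial,fval] = fminsearch(@(x0) objectiveFunction(x0,c(i),...
                paramFile,setup(j)),1);
            if fval<minFVal
                xi = xInitial;
                minFVal = fval;
            end
        end
        saveInParForLoop(['tempresult_' num2str(i)],xi,minFVal);
    end
    delete(gcp('nocreate'));                       % shut down the pool to release worker memory
    parpool;                                       % fresh pool for the next batch
end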

How to avoid variable broadcasting in a parfor loop?

I'm using parfor; unfortunately, two of my variables, x and y, which are both matrices, get broadcast, and I don't know how to avoid it. I've read about it in the MATLAB help but couldn't figure out a solution. How can I prevent x and y from being broadcast?
Here is my code:
parfor k=1:length(Lambda)
    lambda=Lambda(k);
    for p=1:length(Gamma)
        gamma=Gamma(p);
        for Fold=1:size(Fold_indices,2)
            x_Train=x(logical(Fold_indices(:,Fold)),1:end);
            Y_Train=y(logical(Fold_indices(:,Fold)),1:Num_Tasks);
            % Do sth with x_Train and Y_Train
        end
    end
end
I've tried to slice the broadcast data (x) into a cell array, but that didn't solve the problem either.
B=cell(1,J);
% Fill each entry of B with a matrix
% ...do it here
....
parfor k=1:length(Lambda)
    lambda=Lambda(k);
    for p=1:length(Gamma)
        gamma=Gamma(p);
        for Fold=1:J
            x_Train=B{1,J};
            % Do sth with x_Train and Y_Train
        end
    end
end
Interestingly, when I assign the broadcast variable (B) to another variable (D), it stops getting broadcast.
B=cell(1,J);
% Fill each entry of B with a matrix
% ...do it here
....
parfor k=1:length(Lambda)
    D=B;
    lambda=Lambda(k);
    for p=1:length(Gamma)
        gamma=Gamma(p);
        for Fold=1:J
            x_Train=B{1,J};
            % Do sth with x_Train and Y_Train
        end
    end
end
First off: you need to broadcast. Each worker is a separate MATLAB instance, and it needs the data. Sending data to a worker's MATLAB instance is called broadcasting, so there's no preventing it when you use parallel computing; it is at the core of it.
Second: you cannot avoid broadcasting x and y in their entirety here, since you use all of both matrices in each separate parfor iteration. Avoiding broadcasting as a whole requires that you not need the full matrix in each loop iteration, in which case you can slice your variables, as is presented in this answer; i.e. you would have to rewrite your code in such a manner that you do not require all of x and y on each separate worker.
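If the concern is the cost of getting x and y to the workers (rather than avoiding the transfer altogether, which the answer above explains is not possible here), one option not covered in the answer is parallel.pool.Constant (R2015b or newer), which transfers each matrix to every worker once and reuses it across iterations. A minimal sketch, keeping the variable names from the question (Lambda, Gamma, Fold_indices and Num_Tasks are assumed to already exist):
xC = parallel.pool.Constant(x); % copied to each worker once
yC = parallel.pool.Constant(y);
parfor k = 1:length(Lambda)
    lambda = Lambda(k);
    for p = 1:length(Gamma)
        gamma = Gamma(p);
        for Fold = 1:size(Fold_indices,2)
            rows    = logical(Fold_indices(:,Fold));
            x_Train = xC.Value(rows, :);            % read the worker-local copy
            Y_Train = yC.Value(rows, 1:Num_Tasks);
            % Do sth with x_Train and Y_Train
        end
    end
end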

Matlab. Poisson fit. Factorial

I have a histogram that seems to fit a Poisson distribution.
In order to fit it, I declare the function myself as follows:
xdata; ydata; % Arrays in which I have stored the data.
% ydata tells us how many times each xdata value is repeated in the set.
fun = @(x,xdata) (exp(-x(1))*(x(1).^(xdata)))/(factorial(xdata)) % Function I
% want to use in the fit. It is a Poisson distribution.
x0 = [1]; % Approximate value of the parameter lambda to help the fit
p = lsqcurvefit(fun,x0,xdata,ydata); % Fit in the least-squares sense
I get an error. It probably has to do with the factorial. Any ideas?
factorial outputs a vector when given the vector xdata, so you are dividing by a vector. Why are you using plain / division with factorial(xdata)?
For example:
data = [1 2 3];
factorial(data) is then [1! 2! 3!].
Try ./factorial(xdata) (I cannot recall whether the dot is even necessary in this case.)
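Applying that suggestion to the anonymous function from the question (only the division changes to element-wise; factorial requires the entries of xdata to be non-negative integers):
fun = @(x,xdata) exp(-x(1))*(x(1).^xdata)./factorial(xdata);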
You need to use the gamma(xdata+1) function instead of factorial(xdata). The gamma function is a generalisation of the factorial (gamma(n+1) = n!) that can be used for real and complex numbers. Thus, your code would be:
fun = @(x,xdata) exp(-x(1))*x(1).^xdata./gamma(xdata+1);
x = lsqcurvefit(fun,1,xdata,ydata);
Alternatively, you can use MATLAB's fitdist function, which is already optimized, and you might get better results:
pd = fitdist(xdata,'Poisson','Frequency',ydata);
pd.lambda
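A minimal sketch comparing the two suggestions on synthetic data (assumed true lambda of 3; requires the Statistics toolbox and, for lsqcurvefit, the Optimization toolbox):
rng(0);                              % reproducible sample
sample = poissrnd(3, 1e4, 1);        % hypothetical observed counts
ydata = accumarray(sample+1, 1);     % how often each count value occurs
xdata = (0:numel(ydata)-1)';         % the count values themselves
% least-squares fit of the pmf (frequencies normalised to probabilities)
fun = @(x,xdata) exp(-x(1))*x(1).^xdata./gamma(xdata+1);
lambdaLsq = lsqcurvefit(fun, 1, xdata, ydata/sum(ydata));
% maximum-likelihood fit via fitdist
pd = fitdist(xdata, 'Poisson', 'Frequency', ydata);
fprintf('lsqcurvefit: %.3f, fitdist: %.3f\n', lambdaLsq, pd.lambda)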

Matlab optimization: what types of objective functions are 'allowed' with fminsearch.m and Co.?

Examples of optimizations with functions like fmincon.m and fminsearchbnd.m usually minimize objective functions that are relatively simple. By simple I mean that the objective function consists only of some algebraic expression, e.g. the Rosenbrock formula.
In my problem, on the other hand, the objective function consists of several steps, including
computing an L2-norm misfit between an observed data point and a set of n training data points (n~5e4)
selecting those data points from the training data set that give the lowest misfit
then using the row indices of this selected subset to compute the final distance that I intend to minimize.
i.e. I perform operations that cannot be formulated as a single mathematical expression. Can I use such an objective function with tools like fminsearchbnd.m or fmincon.m at all? My results so far are not very promising...
There is an easy and obvious solution for that. You can use fminsearch() to find a minimum of a self-defined function. In my example it fits a polynomial, which of course is easy, but the trick is that it could be anything. You can give the objective function access to your data by making it a nested function, so that they share the same variable scope.
You can start from the following code, fill in everything you want to do part by part, and maybe ask follow-up questions if any come up.
function main
verbose = 1; % some output
% optimize something, maybe a distorted polynomial
x = sort(rand(20,1));
p_original = [1.5, 3, 2, 1];
y = polyval(p_original,x) + 0.5*(rand(size(x))-0.5);
% optimize polynomial of order order. This is an example of how to pass
% a parameter to the fit function.
order = 3;
% obvious solution is this, but we want to do something else
p_polyfit = polyfit(x,y,order)
% we want to do it a bit more complex
pfit = optimize_something(x, y, order, verbose)
% what is happening?
figure
plot(x,polyval(p_original,x),'k-')
hold on
plot(x,y,'ko')
plot(x,polyval(p_polyfit,x),'rs-')
plot(x,fit_function(x,pfit),'gx-')
legend('original','noisy','polyfit','optimization')
end
function pfit = optimize_something(x,y, order, verbose)
% for polynomial of order order we need order+1 coefficients
p0 = ones(1,order+1); % initial guess: all coefficients are 1
if verbose
    fprintf('optimize_something calling fminsearch(@objFun)\n');
end
% hand over only p0 to our objective function
pfit = fminsearch(@objFun, p0);
% ------------------------- NESTED objFUN --------------------------------%
    function e = objFun(p)
        % This function accepts only p as parameter and returns a value e, which
        % will be minimized by some metric (maybe least squares).
        % Since this function is nested, it can also use the predefined variables
        % x, y (and also p0 and verbose).
        % The magic is, we calculate a value yfitted out of x and p by a
        % fit_function. This function can really be anything!
        yfitted = fit_function(x, p);
        e = sum((yfitted-y).^2);
        % e = sum(abs(yfitted-y)); % another possibility
    end
% ------------------------- NESTED objFUN --------------------------------%
if verbose
    disp('pfit found')
end
end
function yfitted = fit_function(x, p)
% In our example we want to fit a polynomial, so we do so. We evaluate the
% polynomial p at x.
yfitted = polyval(p,x);
% But it could be anything, really.. each value in p could be something
% else, maybe the sum of an exponential function and a straight line
% yfitted = p(1)*exp(p(2)*x) + p(3)*x + p(4);
end
You can try to use CVX. It is an add-on for MATLAB that lets you describe your optimisation problem with normal MATLAB code.
Alternatively, write down your objective function, including any constraints. Your description is not clear to me, and it would help you too if you wrote it down in actual formulae.
I read your steps as this:
"Computing an L2-norm between an observed data point and a set of n training data points." It seems that there is a total of one (1) observed data points. Let's call the observed point x. Let's call the training data points y_i for i=1..n.
The L2-Norm is: |x-y_i|.
"Selecting those data points [multiple?] that give the lowest misfit". You haven't said how many data points you want, and how you'd combine multiple points to give a single L2-Norm. Let's assume you want exactly one such point (the closest to the observed data point x). Thus you get: argmin (over i) |x-y_i|. If you have multiple, you could greedily take the k closest points.
"Then using the row indices of this selected subset to compute the final distance that I intend to minimize." And what is the final distance that you intend to minimize?

matlab parfor leads to larger execution time than a for loop

I have a 3-dimensional grid; for each point of the grid I want to calculate a time-dependent function G(t) for a large number of time steps and then sum G over those time steps. Using 4 for loops the execution time becomes very large, so I am trying to avoid this using parfor.
A part of my code:
for i=1:50
    for j=1:50
        for k=1:25
            x_in=i*dx;
            y_in=j*dy;
            z_in=k*dz;
            % dx, dy, dz are some fixed values
            r=sqrt((xx-x_in).^2+(yy-y_in).^2+(zz-z_in).^2);
            % xx, yy, zz are 50x50x25 matrices generated from meshgrid
            % r is a 3d matrix produced inside the 3 for-loops, covering all the points of the grid
            parfor q=1:100
                t=0.5*q;
                G(q)=((a*p)/(t.^1.5)).*(exp(-r.^2/(4*a*t)));
                % a, p are some fixed values
            end
            GG(i,j,k)=sum(G(:));
        end
    end
end
When I use parfor the execution time becomes larger, and I am not sure why this is happening. Maybe I am not so familiar with sliced and indexed variables in a parfor loop.
My PC's processor has 8 threads and I have 8 GB of DDR3 RAM.
Any help will be great.
Thanks
As has been discussed in a previous question, parfor comes with an overhead. Therefore, a loop body that is too simple will execute more slowly with parfor.
In your case, the solution may be to parallelize the outermost loops.
%# preassign GG
GG = zeros(50,50,25);
%# loop over indices into GG
parfor idx = 1:(50*50*25)
    [i,j,k] = ind2sub([50 50 25],idx);
    x_in=i*dx;
    y_in=j*dy;
    z_in=k*dz;
    % dx, dy, dz are some fixed values
    r=sqrt((xx-x_in).^2+(yy-y_in).^2+(zz-z_in).^2);
    % xx, yy, zz are 50x50x25 matrices generated from meshgrid
    % r is a 3d matrix covering all the points of the grid
    for q=1:100
        t=0.5*q;
        G(q)=((a*p)/(t.^1.5)).*(exp(-r.^2/(4*a*t)));
        % a, p are some fixed values
    end
    GG(idx)=sum(G(:));
end