Quickly Evaluating MANY matlabFunctions - matlab

This post builds on my post about quickly evaluating analytic Jacobian in Matlab:
fast evaluation of analytical jacobian in MATLAB
The key difference is that now, I am working with the Hessian and I have to evaluate close to 700 matlabFunctions (instead of 1 matlabFunction, like I did for the Jacobian) each time the hessian is evaluated. So there is an opportunity to do things a little differently.
I have tried to do this two ways so far and I am thinking about implementing a third and was wondering if anyone has any other suggestions. I will go through each method with a toy example, but first some preprocessing to generate these matlabFunctions:
PreProcessing:
% This part of the code is calculated once, it is not the issue
dvs = 5;
X=sym('X',[dvs,1]);
num = dvs - 1; % number of constraints
% multiple functions
for k = 1:num
f1(X(k+1),X(k)) = (X(k+1)^3 - X(k)^2*k^2);
c(k) = f1;
end
gradc = jacobian(c,X).'; % .' performs transpose
parfor k = 1:num
hessc{k} = jacobian(gradc(:,k),X);
end
parfor k = 1:num
hess_name = strcat('hessian_',num2str(k));
matlabFunction(hessc{k},'file',hess_name,'vars',X);
end
METHOD #1 : Evaluate functions in series
%% Now we use the functions to run an "optimization." Just for an example the "optimization" is just a for loop
fprintf('This is test A, where the functions are evaluated in series!\n');
tic
for q = 1:10
x_dv = rand(dvs,1); % these are the design variables
lambda = rand(num,1); % these are the lagrange multipliers
x_dv_cell = num2cell(x_dv); % for passing large design variables
for k = 1:num
hess_name = strcat('hessian_',num2str(k));
function_handle = str2func(hess_name);
H_temp(:,:,k) = lambda(k)*function_handle(x_dv_cell{:});
end
H = sum(H_temp,3);
end
fprintf('The time for test A was:\n')
toc
METHOD # 2: Evaluate functions in parallel
%% Try to run a parfor loop
fprintf('This is test B, where the functions are evaluated in parallel!\n');
tic
for q = 1:10
x_dv = rand(dvs,1); % these are the design variables
lambda = rand(num,1); % these are the lagrange multipliers
x_dv_cell = num2cell(x_dv); % for passing large design variables
parfor k = 1:num
hess_name = strcat('hessian_',num2str(k));
function_handle = str2func(hess_name);
H_temp(:,:,k) = lambda(k)*function_handle(x_dv_cell{:});
end
H = sum(H_temp,3);
end
fprintf('The time for test B was:\n')
toc
RESULTS:
METHOD #1 = 0.008691 seconds
METHOD #2 = 0.464786 seconds
DISCUSSION of RESULTS
This result makes sense because, the functions evaluate very quickly and running them in parallel waists a lot of time setting up and sending out the jobs to the different Matlabs ( and then getting the data back from them). I see the same result on my actual problem.
METHOD # 3: Evaluating the functions using the GPU
I have not tried this yet, but I am interested to see what the performance difference is. I am not yet familiar with doing this in Matlab and will add it once I am done.
Any other thoughts? Comments? Thanks!

Related

Matlab simulation error

I am completely new to Matlab. I am trying to simulate a Wiener and Poisson combined process.
Why do I get Subscripted assignment dimension mismatch?
I am trying to simulate
Z(t)=lambda*W^2(t)-N(t)
Where W is a wiener process and N is a poisson process.
The code I am using is below:
T=500
dt=1
K=T/dt
W(1)=0
lambda=3
t=0:dt:T
for k=1:K
r=randn
W(k+1)=W(k)+sqrt(dt)*r
N=poissrnd(lambda*dt,1,k)
Z(k)=lambda*W.^2-N
end
plot(t,Z)
It is true that some indexing is missing, but I think you would benefit from rewriting your code in a more 'Matlab way'. The following code is using the fact that Matlab basic variables are matrices, and compute the results in a vectorized way. Try to understand this kind of writing, as this is the way to exploit Matlab more efficiently, along with writing shorter and readable code:
T = 500;
dt = 1;
K = T/dt;
lambda = 3;
t = 1:dt:T;
sqdtr = sqrt(dt)*randn(K-1,1); % define sqrt(dt)*r as a vector
N = poissrnd(lambda*dt,K,1); % define N as a vector
W = cumsum([0; sqdtr],1); % cumulative sum instead of the loop
Z = lambda*W.^2-N; % summing the processes element-wiesly
plot(t,Z)
Example for a result:
you forget index
Z(k)=lambda*W.^2-N
it must be
Z(k)=lambda*W(k).^2-N(k)

How to accumulate submatrices without looping (subarray smoothing)?

In Matlab I need to accumulate overlapping diagonal blocks of a large matrix. The sample code is given below.
Since this piece of code needs to run several times, it consumes a lot of resources. The process is used in array signal processing for a so-called subarray smoothing or spatial smoothing. Is there any way to do this faster?
% some values for parameters
M = 1000; % size of array
m = 400; % size of subarray
n = M-m+1; % number of subarrays
R = randn(M)+1i*rand(M);
% main code
S = R(1:m,1:m);
for i = 2:n
S = S + R(i:m+i-1,i:m+i-1);
end
ATTEMPTS:
1) I tried the following alternative vectorized version, but unfortunately it became much slower!
[X,Y] = meshgrid(1:m);
inds1 = sub2ind([M,M],Y(:),X(:));
steps = (0:n-1)*(M+1);
inds = repmat(inds1,1,n) + repmat(steps,m^2,1);
RR = sum(R(inds),2);
S = reshape(RR,m,m);
2) I used Matlab coder to create a MEX file and it became much slower!
I've personally had to fasten up some portions of my code lately. Being not an expert at all, I would recommend trying the following:
1) Vectorize:
Getting rid of the for-loop
S = R(1:m,1:m);
for i = 2:n
S = S + R(i:m+i-1,i:m+i-1)
end
and replacing it for an alternative based on cumsum should be the way to go here.
Note: will try and work on this approach on a future Edit
2) Generating a MEX-file:
In some instances, you could simply fire up the Matlab Coder app (given that you have it in your current Matlab version).
This should generate a .mex file for you, that you can call as it was the function that you are trying to replace.
Regardless of your choice (1) or 2)), you should profile your current implementation with tic; my_function(); toc; for a fair number of function calls, and compare it with your current implementation:
my_time = zeros(1,10000);
for count = 1:10000
tic;
my_function();
my_time(count) = toc;
end
mean(my_time)

Determining convergence rate of fsolve - Matlab

Suppose I'm solving a system of nonlinear equations. A simple example would be:
function example
x0 = [15; -2];
options = optimoptions('fsolve','Display','iter','TolFun',eps,'TolX',eps);
[x,fval,exitflag,output] = fsolve(#P1a,x0,options);
end
function f1 = P1a(x)
f1 = [x(1)+x(2)*(x(2)*(5-x(2))-2)- 13; x(1)+x(2)*(x(2)*(1+x(2))-14)-29];
end
How do I determine the rate of convergence? 'Display','iter' shows me the norm at each step, but I can't find a way to extract these values. (For this particular example, I believe fsolve does not converge to the right solution, but rather to a local minimum. That is not the issue, however. I just want to find a way to estimate the convergence rate.)
You can get plenty out of fsolve. However, you'll need to do some work. Read up on the 'OutputFcn' option and writing output functions for Matlab's Optimization methods. This is vary similar to the option of the same name used by Matlab's ODE solvers. Here's an example that replicates the 'Display','iter' option-value for fsolve (for the default 'trust-region-dogleg' algorithm specifically):
function stop = outfun(x,optimValues,state)
% See private/trustnleqn
stop = false;
switch state
case 'init'
header = sprintf(['\n Norm of First-order Trust-region\n',...
' Iteration Func-count f(x) step optimality radius']);
disp(header);
case 'iter'
iter = optimValues.iteration; % Iteration
numFevals = optimValues.funccount; % Func-count
F = optimValues.fval; % f(x)
normd = optimValues.stepsize; % Norm of step
normgradinf = optimValues.firstorderopt; % First-order optimality
Delta = optimValues.trustregionradius; % Trust-region radius
if iter > 0
formatstr = ' %5.0f %5.0f %13.6g %13.6g %12.3g %12.3g';
iterOutput = sprintf(formatstr,iter,numFevals,F'*F,normd,normgradinf,Delta);
else
formatstr0 = ' %5.0f %5.0f %13.6g %12.3g %12.3g';
iterOutput = sprintf(formatstr0,iter,numFevals,F'*F,normgradinf,Delta);
end
disp(iterOutput);
case 'done'
otherwise
end
You can then call this via:
function example
P1a=#(x)[x(1)+x(2)*(x(2)*(5-x(2))-2)- 13; x(1)+x(2)*(x(2)*(1+x(2))-14)-29];
x0 = [15; -2];
opts = optimoptions('fsolve','Display','off','OutputFcn',#outfun,'TolFun',eps,'TolX',eps);
[x,fval,exitflag,output] = fsolve(P1a,x0,opts);
This still just prints to the Command Window. From here it's a matter of creating an output function that can write data to an array, file, or other data structure. Here's how you might do that with a global variable (in general, not a good idea):
function stop = outfun2(x,optimValues,state)
stop = false;
global out; % Global variable, define in main function too
switch state
case 'init'
out = [];
case 'iter'
iter = optimValues.iteration; % Iteration
numFevals = optimValues.funccount; % Func-count
F = optimValues.fval; % f(x)
normd = optimValues.stepsize; % Norm of step
normgradinf = optimValues.firstorderopt; % First-order optimality
Delta = optimValues.trustregionradius; % Trust-region radius
out = [out;iter numFevals F'*F normd normgradinf Delta];
case 'done'
otherwise
end
Then just declare global out; in your main function before calling fsolve. You could also accomplish this by making your output function a nested function, in which case the out array would be shared with the outer main function.
The second output function example performs dynamic memory allocation instead of reallocating the entire out array. There's no way around this because both we and the algorithm don't know how many iterations it will take to converge. However, for a few hundred iterations, dynamic memory allocation will be plenty fast.
I'll leave "determining the rate of convergence" to you now that you have the tools in hand...

How to use MATLAB to numerically solve equation with unknown embedded in integral?

I've been trying to use MATLAB to solve equations like this:
B = alpha*Y0*sqrt(epsilon)/(pi*ln(b/a)*sqrt(epsilon_t))*integral from
0 to pi of
(2*sinint(k0*sqrt(epsilon*(a^2+b^2-2abcos(theta))-sinint(2*k0*sqrt(epsilon)*a*sin(theta/2))-sinint(2*k0*sqrt(epsilon)*b*sin(theta/2)))
with regard to theta
Where epsilon is the unknown.
I know how to symbolically solve equations with unknown embedded in an integral by using int() and solve(), but using the symbolic integrator int() takes too long for equations this complicated. When I try to use quad(), quadl() and quadgk(), I have trouble dealing with how the unknown is embedded in the integral.
This sort of thing gets complicated real fast. Although it is possible to do it all in a single inline equation, I would advise you to split it up into multiple nested functions, if only for readability.
The best example of why readability is important: you have a bracketing problem in the eqution you posted; there's not enough closing brackets, so I can't be entirely sure what the equation looks like in mathematical notation :)
Anyway, here's one way to do it with the version I --think-- you meant:
function test
% some random values for testing
Y0 = rand;
b = rand;
a = rand;
k0 = rand;
alpha = rand;
epsilon_t = rand;
% D is your B
D = -0.015;
% define SIMPLE anonymous function
Bb = #(ep) F(ep).*main_integral(ep) - D;
% aaaand...solve it!
sol = fsolve(Bb, 1)
% The anonymous function above is only simple, because of these:
% the main integral
function val = main_integral(epsilon)
% we need for loop through epsilon, due to how quad(gk) solves things
val = zeros(size(epsilon));
for ii = 1:numel(epsilon)
ep = epsilon(ii);
% NOTE how the sinint's all have a different function as argument:
val(ii) = quadgk(#(th)...
2*sinint(A(ep,th)) - sinint(B(ep,th)) - sinint(C(ep,th)), ...
0, pi);
end
end
% factor in front of integral
function f = F(epsilon)
f = alpha*Y0*sqrt(epsilon)./(pi*log(b/a)*sqrt(epsilon_t)); end
% first sinint argument
function val = A(epsilon, theta)
val = k0*sqrt(epsilon*(a^2+b^2-2*a*b*cos(theta))); end
% second sinint argument
function val = B(epsilon, theta)
val = 2*k0*sqrt(epsilon)*a*sin(theta/2); end
% third sinint argument
function val = C(epsilon, theta)
val = 2*k0*sqrt(epsilon)*b*sin(theta/2); end
end
The solution above will still be quite slow, but I think that's pretty normal for integrals this complicated.
I don't think implementing your own sinint will help much, as most of the speed loss is due to the for loops with non-builtin functions...If it's speed you want, I'd go for a MEX implementation with your own Gauss-Kronrod adaptive quadrature routine.

Matlab: Optimizing speed of function call and cosine sign-finding in a loop

The code in question is here:
function k = whileloop(odefun,args)
...
while (sign(costheta) == originalsign)
y=y(:) + odefun(0,y(:),vars,param)*(dt); % Line 4
costheta = dot(y-normpt,normvec);
k = k + 1;
end
...
end
and to clarify, odefun is F1.m, an m-file of mine. I pass it into the function that contains this while-loop. It's something like whileloop(#F1,args). Line 4 in the code-block above is the Euler method.
The reason I'm using a while-loop is because I want to trigger upon the vector "y" crossing a plane defined by a point, "normpt", and the vector normal to the plane, "normvec".
Is there an easy change to this code that will speed it up dramatically? Should I attempt learning how to make mex files instead (for a speed increase)?
Edit:
Here is a rushed attempt at an example of what one could try to test with. I have not debugged this. It is to give you an idea:
%Save the following 3 lines in an m-file named "F1.m"
function ydot = F1(placeholder1,y,placeholder2,placeholder3)
ydot = y/10;
end
%Run the following:
dt = 1.5e-12 %I do not know about this. You will have to experiment.
y0 = [.1,.1,.1];
normpt = [3,3,3];
normvec = [1,1,1];
originalsign = sign(dot(y0-normpt,normvec));
costheta = originalsign;
y = y0;
k = 0;
while (sign(costheta) == originalsign)
y=y(:) + F1(0,y(:),0,0)*(dt); % Line 4
costheta = dot(y-normpt,normvec);
k = k + 1;
end
disp(k);
dt should be sufficiently small that it takes hundreds of thousands of iterations to trigger.
Assume I must use the Euler method. I have a stochastic differential equation with state-dependent noise if you are curious as to why I tell you to take such an assumption.
I would focus on your actual ODE integration. The fewer steps you have to take, the faster the loop will run. I would only worry about the speed of the sign check after you've optimized the actual integration method.
It looks like you're using the first-order explicit Euler method. Have you tried a higher-order integrator or an implicit method? Often you can increase the time step significantly.