Local minimum at initial point when fitting gaussian with `lsqcurvefit` - matlab

I am writing MATLAB code with the intention to do some fitting. I simulated a curve using a second-order Gaussian (see my code below) and tried fitting it using the lsqcurvefit function. Unfortunately, MATLAB returns the same guess values as the 'optimized' parameters and apparently gets stuck in a local minimum.
Can someone please provide some advice as to what might be wrong here? I know this can happen if the guess is far from the 'true' values, but I expected MATLAB to return an answer closer to the true values. Improving the initial guess to [29,0] (which is much closer to the actual value) results in the same output: that the initial point is a local minimum.
%%%%%%%%%%
function z = testfunction(x, xdata)
% Second derivative of a Gaussian: x(1) is the FWHM, x(2) the center.
sigma = x(1)/(2*sqrt(2*log(2)));
z = ((xdata.^2 - 2*x(2)*xdata - sigma^2 + x(2)^2)./(sigma^5*sqrt(2*pi))) ...
    .* exp(-(xdata - x(2)).^2/(2*sigma^2));
end
%%%%%%%%
% Simulate Data
xdata= -50:1:50;
ydata = testfunction([30,0],xdata);
% Fit Data
xfit = lsqcurvefit(@testfunction, [19,-4], xdata, ydata);
xfit(1)
xfit(2)
yfit=testfunction([xfit(1),xfit(2)],xdata);
% Plot Data
plot(xdata,yfit);
hold on;
plot(xdata,ydata,'o')
Output:
Initial point is a local minimum.
Optimization completed because the size of the gradient at the initial point
is less than the default value of the optimality tolerance.
<stopping criteria details>
ans =
19
ans =
-4

Short answer: check the stopping criteria details and tighten the stopping criteria accordingly:
options = optimoptions('lsqcurvefit','OptimalityTolerance', 1e-16, 'FunctionTolerance', 1e-16);
xfit = lsqcurvefit(@testfunction, [19,-4], xdata, ydata, [], [], options);
What is the problem?
lsqcurvefit is a numerical solver and therefore uses stopping criteria to decide when the local minimum has been approached closely enough. In general, you never reach the exact solution. The solution to your problem is therefore to tighten the stopping criteria so that a more accurate solution is requested.
How to solve it?
By clicking on stopping criteria details, you receive the following explanation:
Optimization completed: The final point is the initial point. The
first-order optimality measure, 7.254593e-07, is less than
options.OptimalityTolerance = 1.000000e-06.
Optimization Metric                              Options
relative first-order optimality = 7.25e-07       OptimalityTolerance = 1e-06 (default)
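If you prefer to inspect this metric programmatically rather than through the link, a sketch: lsqcurvefit's output struct (a standard return value) reports the first-order optimality measure directly.
[xfit, resnorm, residual, exitflag, output] = ...
    lsqcurvefit(@testfunction, [19,-4], xdata, ydata);
output.firstorderopt   % compare against options.OptimalityTolerance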
So, you should decrease the OptimalityTolerance (e.g. to 1e-16):
options = optimoptions('lsqcurvefit','OptimalityTolerance', 1e-16);
xfit = lsqcurvefit(@testfunction, [19,-4], xdata, ydata, [], [], options);
The new result is better than the previous one, but still not very good. Inspecting the stopping criteria again shows that you also need to tighten FunctionTolerance:
options = optimoptions('lsqcurvefit','OptimalityTolerance', 1e-16, 'FunctionTolerance', 1e-16);
Why are the default options that bad?
Note that you need to tweak the stopping criteria because your function returns relatively small values. Multiplying z by a factor of 1000, without specifying any options, would also produce a nice fit.
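A minimal sketch of that scaling alternative (the factor of 1000 is illustrative; model and data are scaled together, so the fitted parameters are unchanged):
% Scale the model so the residuals are no longer tiny.
scale = 1000;                                  % illustrative factor
scaledfun = @(x, xdata) scale*testfunction(x, xdata);
ydata_s = scaledfun([30,0], xdata);            % regenerate the data at the new scale
xfit = lsqcurvefit(scaledfun, [19,-4], xdata, ydata_s)  % default options now suffice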

Related

solving overdetermined non-linear equations in matlab

I have to calculate the values of 3 variables from probably 8 or 9 non-linear equations (maybe more, for accuracy).
I was using lsqnonlin and fsolve.
Using lsqnonlin, it says the solver stopped prematurely (mainly due to the iteration, funEvals, and tolerance settings) and the output is far from the exact solution. I tried, but I don't know on what basis I should set those parameters.
Using fsolve, it says no solution found.
I also used LMFnlsq and LMFsolve, but they give output nowhere near the exact solution. I tried changing other parameters too, but I could not bring the solutions to my desired values.
Is there any other way to solve these overdetermined non-linear equations?
My code till now:
x0 = [20 40 275];
eqn = @(x)[((x(1)-Sat(1,1))^2 + (x(2)-Sat(1,2))^2 + (x(3)-Sat(1,3))^2) - dis(1)^2;
           ((x(1)-Sat(2,1))^2 + (x(2)-Sat(2,2))^2 + (x(3)-Sat(2,3))^2) - dis(2)^2;
           ((x(1)-Sat(3,1))^2 + (x(2)-Sat(3,2))^2 + (x(3)-Sat(3,3))^2) - dis(3)^2;
           ((x(1)-Sat(4,1))^2 + (x(2)-Sat(4,2))^2 + (x(3)-Sat(4,3))^2) - dis(4)^2;
           ((x(1)-Sat(5,1))^2 + (x(2)-Sat(5,2))^2 + (x(3)-Sat(5,3))^2) - dis(5)^2;
           ((x(1)-Sat(6,1))^2 + (x(2)-Sat(6,2))^2 + (x(3)-Sat(6,3))^2) - dis(6)^2;
           ((x(1)-Sat(7,1))^2 + (x(2)-Sat(7,2))^2 + (x(3)-Sat(7,3))^2) - dis(7)^2;
           ((x(1)-Sat(8,1))^2 + (x(2)-Sat(8,2))^2 + (x(3)-Sat(8,3))^2) - dis(8)^2;
           ((x(1)-Sat(9,1))^2 + (x(2)-Sat(9,2))^2 + (x(3)-Sat(9,3))^2) - dis(9)^2;
           ((x(1)-Sat(10,1))^2 + (x(2)-Sat(10,2))^2 + (x(3)-Sat(10,3))^2) - dis(10)^2];
lb = [0 0 0];
ub = [100 100 10000];
options = optimoptions('lsqnonlin','MaxFunEvals',3000,'MaxIter',700,'TolFun',1e-18);%,'TolX',1);
x= lsqnonlin(eqn,x0,lb,ub,options)
Error:
Solver stopped prematurely.
lsqnonlin stopped because it exceeded the iteration limit,
options.MaxIter = 700 (the selected value).
x = 20.349 46.633 9561.5
Hoping for some suggestions!
Thanks in advance!
I usually model this explicitly:
min  w'w
s.t. f_i(x) = w_i
     L <= x <= U
where w is a free variable.
It should be easy to calculate a feasible (but non-optimal) solution in advance. If you can find a "good" initial solution, that would be even better. Then use a general-purpose NLP solver (e.g. fmincon) and pass in your initial feasible solution (both x and w). The best approach is to use a modeling system that allows automatic differentiation; otherwise you should provide correct and precise gradients (and, if needed, second derivatives). See also the advice here.
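A minimal sketch of that formulation with fmincon, assuming the Sat matrix and dis vector from the question (the bounds and starting point are also taken from the question; everything else is illustrative):
nx = 3; nw = 10;                                     % 3 unknowns, 10 residuals
% Residuals f_i(x), matching the question's equations; Sat is 10-by-3 and
% dis has 10 elements (implicit expansion requires R2016b+).
resid = @(x) sum((Sat - x(:)').^2, 2) - dis(:).^2;
obj = @(v) sum(v(nx+1:end).^2);                      % minimize w'w
nonlcon = @(v) deal([], resid(v(1:nx)) - v(nx+1:end)');  % ceq: f_i(x) = w_i
lb = [0 0 0, -inf(1,nw)];  ub = [100 100 10000, inf(1,nw)];
x0 = [20 40 275];
v0 = [x0, resid(x0)'];                               % feasible start: w = f(x0)
v = fmincon(obj, v0, [], [], [], [], lb, ub, nonlcon);
x = v(1:nx)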

Interactive curve fitting with MATLAB using custom GUI?

I find examples the best way to demonstrate my question. I generate some data, add some random noise, and fit it to get back my chosen "generator" value...
x = linspace(0.01,1,50);
value = 3.82;
y = exp(-value.*x);
y = awgn(y,30);
options = optimset('MaxFunEvals',1000,'MaxIter',1000,'TolFun',1e-10,'Display','off');
model = @(p,x) exp(-p(1).*x);
startingVals = [5];
lb = [1];
ub = [10];
[fittedValue] = lsqcurvefit(model,startingVals,x,y,lb,ub,options)
fittedGraph = exp(-fittedValue.*x);
plot(x,y,'o');
hold on
plot(x,fittedGraph,'r-');
In this new example, I have generated the same data, but this time added much more noise to the first 15 points. Because the noise is random, sometimes it works out okay, but after a few runs I get a good example that illustrates my problem. The code is the same, except for these lines added under value = 3.82:
y = exp(-value.*x);
y(1:15) = awgn(y(1:15),5);
y(16:end) = awgn(y(16:end),30);
The fit is clearly poor over the range where the data seems reliable, because it is fitting all of points 1-50. What I want is to say: okay MATLAB, I can see we have some noisy data, but it looks decent over a range, so only fit your exponential from point 15 to the end. I could go back to my code and update it to do this, but I will be batch-fitting graphs like this, where each one has a different range of 'good' data.
So what I am after is a GUI callback mechanism that allows me to click on two circles in the data and have them change color or something, indicating that lsqcurvefit will only fit over that range. Internally, all that has to change is inside the lsqcurvefit call, e.g.
x(16:end),y(16:end)
But the range should update depending on the starting and ending circles I have clicked.
I hope my question is clear. Thanks.
You could use ginput to select the two points for your min and max in the plot.
[px, py] = ginput(2);
% this returns the x and y coordinates of two points clicked one after the other;
% the min point is assumed to be clicked first (using x and y as the output
% names would overwrite the data vectors, and min/max would shadow the built-ins)
pmin = [px(1) py(1)];
pmax = [px(2) py(2)];
Then you could fit your curve with the coordinates for pmin and pmax, I guess.
You could also switch between a right-click for the min and a left-click for the max, etc.
Hope this helps you.
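A minimal sketch of wiring the clicked x-coordinates into the fit range (variable names follow the question's example; idx is an illustrative helper):
[px, ~] = ginput(2);                        % click the left, then the right boundary
idx = x >= min(px) & x <= max(px);          % logical mask for the selected range
fittedValue = lsqcurvefit(model, startingVals, x(idx), y(idx), lb, ub, options);
plot(x(idx), y(idx), 'go');                 % highlight the selected points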

Matlab Kolmogorov-Smirnov Test

I'm using MATLAB to analyze some neuroscience data, and I made an interspike interval distribution and fit an exponential to it. Then, I wanted to check this fit using a Kolmogorov-Smirnov test with MATLAB.
The data for the neuron spikes is just stored in a variable called spikes. It is a 111-by-1 cell array, where each cell holds a vector; each entry in this spikes array represents a trial. The number of spikes in each trial varies. For example, spikes{1} is a [1x116 double], meaning there are 116 spikes. The next has 115 spikes, then 108, etc.
Now, I understand that the kstest in MATLAB takes a couple of parameters. You enter the data as the first one, so I took all the interspike intervals and stored them in a row vector alldiffs. I want to set my CDF to that of the fitted exponential:
test_cdf = [transpose(alldiffs), transpose(1-exp(-alldiffs*firingrate))];
Note that the theoretical exponential (with which I fit the data) is r*exp(-r*t), where r is the firing rate. I get a firing rate of about 0.2. Now, when I put this all together, I run the kstest:
[h,p] = kstest(alldiffs, 'CDF', test_cdf)
However, the result is a p value on the order of 1.4455e-126. I've tried redoing the test_cdf with another of the methods on Mathworks' website documentation:
test_cdf = [transpose(alldiffs), cdf('exp', transpose(alldiffs), 1/firingrate)];
This gives the exact same result! Is the fit just horrible? I don't know why I get such low p-values. Please help!
I would post an image of the fit, but I don't have enough reputation.
P.S. If there is a better place to post this, let me know and I'll repost.
Here is an example with fake data and yet another way to create the CDF:
>> data = exprnd(.2, 100, 1);
>> test_cdf = makedist('Exponential', 'mu', .2);
>> [h, p] = kstest(data, 'CDF', test_cdf)
h =
0
p =
0.3418
However, why are you doing a KS Test?
All models are wrong, some are useful.
No neuron is perfectly a Poisson process, and with enough data you'll always have a significantly non-exponential ISI, as measured by a KS test. That doesn't mean you can't make the simplifying assumption of an exponential ISI, depending on what phenomena you're trying to model.

How to set the stopping criterion of Matlab's fmincon to "output.stepsize" when "active-set" algorithm is used

How can the output.stepsize be used as a stopping criterion for fmincon when using the 'active-set' algorithm?
When I tried to solve a nonlinear, constrained, non-convex optimization problem, I observed that the output.stepsize value shown via 'PlotFcns' would be a better stopping criterion than the tolerances or maximum evaluations. But in the options argument structure, there is no such option.
I also noticed that fmincon uses nonlcon to solve the problem where output.stepsize is calculated.
If I don't want to change the original code of either fmincon or nonlcon, how can I set an upper limit for output.stepsize to use as a stopping criterion for my optimization run?
Here's one way that you might try to accomplish this. I'm using one of the examples from the doc (there's also discussion of how to do exactly this there as well). I've simply added an output function that returns stop = true when the optimValues.stepsize field reaches a threshold. I also had to make sure that this field wasn't empty, as it appears to be at initialization.
function fmincontest
A = [-1 -2 -2;
      1  2  2];
b = [0; 72];
x0 = [10; 10; 10];
options = optimoptions('fmincon','Algorithm','active-set','OutputFcn',@outfun);
[x, fval, exitflag] = fmincon(@myfun, x0, A, b, [], [], [], [], [], options)

function f = myfun(x)
f = -x(1)*x(2)*x(3);

function stop = outfun(x, optimValues, state)
stop = ~isempty(optimValues.stepsize) && optimValues.stepsize < 0.05;
If you check the exitflag output from fmincon, you'll see that it now returns -1 because the output function is stopping the optimization, again as indicated in the documentation. If you're already using a plot function, you can use that instead, as they have the same format.
You'll need to tailor this to your problem, adjust the threshold, and of course confirm that it works. I can't comment on how good of an idea this is. You should confirm that your error tolerances are still met satisfactorily for all cases. And I'd still want to ask why you can't specify appropriate 'TolCon' and 'TolX' tolerances to achieve what you need.

Matlab: if statements and abs() function in variable-step ODE solvers

I was reading this post online where the poster mentioned that using if statements and abs() functions can have negative repercussions in MATLAB's variable-step ODE solvers (like ode45). According to the OP, they can significantly affect the time step (requiring too small a step) and give poor results when the differential equations are finally integrated. I was wondering whether this is true, and if so, why. Also, how can this problem be mitigated without resorting to fixed-step solvers? I've given example code below to show what I mean:
function [Z,Y] = sauters(We,Re,rhos,nu_G,Uinj,Dinj,theta,ts,SMDs0,Uzs0,...
Uts0,Vzs0,zspan,K)
Y0 = [SMDs0;Uzs0;Uts0;Vzs0]; %Initial Conditions
options = odeset('RelTol',1e-7,'AbsTol',1e-7); %Tolerance Levels
[Z,Y] = ode45(@func,zspan,Y0,options);
function DY = func(z,y)
DY = zeros(4,1);
%Calculate Local Droplet Reynolds Numbers
Rez = y(1)*abs(y(2)-y(4))*Dinj*Uinj/nu_G;
Ret = y(1)*abs(y(3))*Dinj*Uinj/nu_G;
%Calculate Droplet Drag Coefficient
Cdz = dragcof(Rez);
Cdt = dragcof(Ret);
%Calculate Total Relative Velocity and Droplet Reynolds Number
Utot = sqrt((y(2)-y(4))^2 + y(3)^2);
Red = y(1)*abs(Utot)*Dinj*Uinj/nu_G;
%Calculate Derivatives
%Axial Droplet Velocity
DY(2) = -(3/4)*rhos*(Cdz/y(1))*Utot*(1 - y(4)/y(2));
%Tangential Droplet Velocity
DY(3) = -(3/4)*rhos*(Cdt/y(1))*Utot*(y(3)/y(2));
%Axial Gas Velocity
DY(4) = (3/8)*((ts - ts^2)/(z^2))*(cos(theta)/(tan(theta)^2))*...
    (Cdz/y(1))*(Utot/y(4))*(1 - y(4)/y(2)) - y(4)/z;
%SMD (computed last, since it uses DY(2) and DY(3))
if(Red > 1)
    DY(1) = -(We/8)*rhos*y(1)*(Utot*Utot/y(2))*(Cdz*(y(2)-y(4)) + ...
        Cdt*y(3)) + (We/6)*y(1)*y(1)*(y(2)*DY(2) + y(3)*DY(3)) + ...
        (We/Re)*K*(Red^0.5)*Utot*Utot/y(2);
elseif(Red < 1)
    DY(1) = -(We/8)*rhos*y(1)*(Utot*Utot/y(2))*(Cdz*(y(2)-y(4)) + ...
        Cdt*y(3)) + (We/6)*y(1)*y(1)*(y(2)*DY(2) + y(3)*DY(3)) + ...
        (We/Re)*K*(Red)*Utot*Utot/y(2);
end
end
end
Where the function "dragcof" is given by the following:
function Cd = dragcof(Re)
if(Re <= 0.01)
Cd = (0.1875) + (24.0/Re);
elseif(Re > 0.01 && Re <= 260.0)
Cd = (24.0/Re)*(1.0 + 0.1315*Re^(0.32 - 0.05*log10(Re)));
else
Cd = (24.0/Re)*(1.0 + 0.1935*Re^0.6305);
end
end
This is because derivatives that are computed using if-statements, modulus operations (abs()), or things like unit step functions, Dirac deltas, etc., will introduce discontinuities in the value of the solution or its derivative(s), resulting in kinks, jumps, inflection points, etc.
This implies that the solution of the ODE completely changes behavior at the relevant times. What variable-step integrators will do is
detect this
recognize that they won't be able to use information directly beyond the "problem point"
decrease the step, and repeat from the top, until the problem point satisfies the accuracy demands
Therefore, there will be many failed steps and reductions in step size near the problem points, negatively affecting the overall integration time.
Variable-step integrators will continue to produce good results, however. Constant-step integrators are not a good remedy for this sort of problem, since they cannot detect such problems in the first place (there is no error estimation).
What you could do is simply split the problem up into multiple parts. If you know beforehand at what points in time the changes will occur, you just start a new integration for each interval, each time using the output of the previous integration as the initial value for the next one.
If you don't know beforehand where the problems will be, you could use a very nice feature of Matlab's ODE solvers called event functions (see the documentation). You let one of Matlab's solvers detect the event (a change of sign in the derivative, a change of condition in the if-statement, or whatever) and terminate the integration when such an event is detected. Then start a new integration from the last time, with the final conditions of the previous integration as initial conditions, as before, until the final time is reached.
There will still be a slight penalty in overall execution time this way, since Matlab will try to locate the event accurately. However, it is still much better than running the integration blindly, in terms of both execution time and accuracy of the results.
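A minimal sketch of that restart-at-the-event pattern, using a toy right-hand side with an abs() kink at y = 1 (the RHS, tolerances, and time span are illustrative, not the question's model):
% First leg: integrate until the solution crosses the kink at y = 1.
f = @(t, y) -1 - abs(y - 1);               % RHS with an abs() kink at y = 1
evtfun = @(t, y) deal(y - 1, 1, 0);        % event: [value, isterminal, direction]
opts = odeset('Events', evtfun, 'RelTol', 1e-7, 'AbsTol', 1e-7);
[t1, y1, te] = ode45(f, [0 5], 3, opts);   % stops at the event time te
% Second leg: restart from the event point and continue to the final time.
opts2 = odeset('RelTol', 1e-7, 'AbsTol', 1e-7);
[t2, y2] = ode45(f, [t1(end) 5], y1(end), opts2);
plot([t1; t2], [y1; y2])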
Yes, it is true, and it happens because your solution is not smooth enough at some points.
Assume you want to integrate y'(t) = f(t,y). Then whatever happens in f gets integrated to become y. Thus, if your definition of f contains
abs(), then f has a kink and y is still continuous and once differentiable;
an if statement, then f has a jump and y a kink, with no further differentiability.
Matlab's ode45 presumes that your solution is 5 times differentiable and tries to ensure an accuracy of order 4. Non-smooth points of your function are misinterpreted as stiffness, which leads to small step sizes and even to breakdowns.
What you can do: because of the lack of smoothness, you cannot expect high accuracy anyway. Thus, ode23 might be a better choice. In the worst case, you have to stick to first-order schemes.
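A small experiment sketching that comparison (the right-hand side, tolerances, and time span are illustrative; odeset's 'Stats' option prints the step and evaluation counts for each solver):
% Compare ode45 and ode23 on a right-hand side with an abs() kink at y = 1.
f = @(t, y) -1 - abs(y - 1);
opts = odeset('RelTol', 1e-7, 'AbsTol', 1e-7, 'Stats', 'on');
disp('ode45:'); [t45, y45] = ode45(f, [0 5], 3, opts);
disp('ode23:'); [t23, y23] = ode23(f, [0 5], 3, opts);
% numel(t45) vs. numel(t23) also shows how many steps each solver took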