Matlab fitting error using lsqcurvefit - matlab

I'm developing code to fit the Gompertz equation to a bacterial growth curve and am practicing with some example data provided at the following website:
http://www.math.tamu.edu/~phoward/m442/ia3sol.pdf.
According to this code, the fit should almost match the data (see the graph on page 3 of the document above). However, when I run the code, the data plot correctly but the lsqcurvefit result fits very poorly, and I get the following message:
Local minimum possible.
lsqcurvefit stopped because the size of the current step is less than
the default value of the step size tolerance.
Is there anything I am doing wrong?
Thank you for your time,
Laura

The problem lies in the linked document.
The Gompertz function is parametrized the following way:
%with parameters p(1) = K and p(2) = initial population
%p(3) = r.
V = p(1).*(p(2)/p(1)).^exp(-p(3)*t);
However, the initial guess for the curve fitting is given in a different parameter order ([r, K, initial population] instead of [K, initial population, r]). The result vector in the document is mislabeled in the same way.
Changing the initial guess p0 to [1000, 3.93, 0.01] makes the curve fitting converge and gives a good fit.
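A minimal sketch of the corrected setup, assuming the Optimization Toolbox is available. The data here are synthetic (a Gompertz curve with made-up parameters plus noise), not the data from the linked document; pTrue, the noise level and the lower bounds are illustrative assumptions.

```matlab
% Gompertz model with the parameter order p = [K, initial population, r]
gompertz = @(p, t) p(1) .* (p(2) ./ p(1)) .^ exp(-p(3) .* t);

% Synthetic data (assumption: NOT the data from the linked document)
rng(0);                               % reproducible noise
t = (0:10:500)';
pTrue = [1000, 4, 0.01];              % hypothetical "true" parameters
V = gompertz(pTrue, t) .* (1 + 0.02 * randn(size(t)));

% Initial guess in the SAME order as the model: [K, initial population, r]
p0 = [1000, 3.93, 0.01];
lb = [0, 0, 0];                       % keep all parameters non-negative
pFit = lsqcurvefit(gompertz, p0, t, V, lb, []);

plot(t, V, 'ko', t, gompertz(pFit, t), 'r-')
legend('Data', 'Gompertz fit', 'Location', 'southeast')
```

The point of the sketch is only that the order of the entries in p0 must match the order in which the model function reads p(1), p(2) and p(3).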


MATLAB: polyval function for N greater than 1

I am trying to graph the polynomial fit of a 2D dataset in MATLAB.
This is what I tried:
rawTable = readtable('Test_data.xlsx','Sheet','Sheet1');
x = rawTable.A;
y = rawTable.B;
figure(1)
scatter(x,y)
c = polyfit(x,y,2);
y_fitted = polyval(c,x);
hold on
plot(x,y_fitted,'r','LineWidth',2)
rawTable.A and rawTable.B are randomly generated numbers (i.e., the x dataset cannot be represented in the form x=0:0.1:100).
The result:
[figure: second-order polynomial]
But the result I expect looks like this (generated in Excel):
[figure not preserved]
How can I graph the second-order polynomial fit in MATLAB?
I sense some confusion about what the output of each of these MATLAB functions means, so I'll clarify. Some details are needed, so expect some verbosity; a quick answer is available at the end.
c = polyfit(x,y,2) gives the coefficient vector of the polynomial fit. You can get fit information such as an error estimate by following the documentation.
Call this polynomial P. In MATLAB, P is the function P=@(x)c(1)*x.^2+c(2)*x+c(3).
Suppose you have a single point X, then polyval(c,X) outputs the value of P(X). And if x is a vector, polyval(c,x) is a vector corresponding to [P(x(1)), P(x(2)),...].
Plotting these values against the unsorted x does not draw the fitted curve, because plot() connects the points in the order in which they appear in x. As a quick hack to see something visually, you can try plot(sort(x),polyval(c,sort(x)),'r','LineWidth',2), i.e., sort your data first and plot on those x-values.
However, it is only a hack because a) your data set may be so irregularly spaced that the connected line segments do not represent the function well, or b) evaluating on your whole data set is unnecessary and inefficient.
The robust and 'standard' way to plot a 2D function of known analytical form in MATLAB is as follows:
1. Define some evenly spaced x-values over the interval on which you want to plot the function, for example x=1:0.1:10 or x=linspace(0,1,100).
2. Evaluate the function on these x-values.
3. Pass the two vectors to plot(). plot() can either draw the function as sampled points or connect the points with straight line segments, which is the default.
(Step 1 is usually called sampling or discretization, if you want a single word for it.)
So, instead of using the x in your original data set, you should do something like:
t=linspace(min(x),max(x),100);
plot(t,polyval(c,t),'r','LineWidth',2)
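As a self-contained illustration of the steps above, here is a short sketch with made-up data. The random x-values, the quadratic used to generate y and the noise level are all assumptions standing in for Test_data.xlsx.

```matlab
% Hypothetical unsorted data standing in for rawTable.A and rawTable.B
rng(1);
x = 100 * rand(50, 1);                       % random, unsorted x-values
y = 0.02 * x.^2 - x + 5 + randn(50, 1);      % noisy quadratic

c = polyfit(x, y, 2);                        % coefficients, highest power first

% Evaluate the fitted polynomial on an evenly spaced, sorted grid
t = linspace(min(x), max(x), 100);

figure
scatter(x, y)
hold on
plot(t, polyval(c, t), 'r', 'LineWidth', 2)  % smooth curve instead of a zigzag
hold off
```

Because t is sorted, the line segments that plot() draws between consecutive points trace the parabola from left to right.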

On Solving ODE equations and specifying number of samples

I am trying to understand the following set of equations given here: https://matlabgeeks.com/tips-tutorials/modeling-with-odes-in-matlab-part-5b/
The equations are those of the chaotic Lorenz system. The tutorial is easy to follow, but I do not understand how to set the number of data points to generate, i.e., the length of the time series. Which parameter determines how many data points are generated? I have looked at other resources as well but could not work it out. For instance, by trial and error I found that if I specify
eps = 0.000001; T = [0 45] then I get about 7000 data points. If I want 10,000 data points, I don't know what the values of these parameters should be.
As described in the article (and in parts 1 and 2 of the series), the sequence of sample points is generated adaptively so that each segment contributes about the same amount of truncation error to the global error, weighted by the absolute and relative tolerances. Additionally, the solver interpolates inside each segment to produce 3 inner points, so that a plot still appears curved for large tolerances. That is, the internal segmentation is given by T(1:4:end); the other points are interpolated.
You can also prescribe your own sample times; the values there are likewise interpolated from the "dense output", i.e., the interpolants over the internally produced segmentation.
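T = linspace(t0, tend, 7000);
[T, Y] = ode45(@lorenz, T, Y0, options); % with two outputs, Y holds the solution at the times in T
You could also extract the dense output via
sol = ode45(@lorenz, [t0 tend], Y0, options); % with one output, ode45 returns a solution structure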
and then use the provided interpolation to compute samples at arbitrary times
Y = deval(sol,T);
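For the 10,000 points asked about in the question, a minimal sketch could look like this. The Lorenz right-hand side is written inline with the classic parameter values (sigma = 10, rho = 28, beta = 8/3), and the initial condition is made up; both are assumptions, since the tutorial's lorenz.m is not reproduced here.

```matlab
% Lorenz system with the classic parameters (assumed, not taken from the tutorial)
sigma = 10; rho = 28; beta = 8/3;
lorenz = @(t, y) [sigma * (y(2) - y(1));
                  y(1) * (rho - y(3)) - y(2);
                  y(1) * y(2) - beta * y(3)];

Y0 = [1; 1; 1];                                    % hypothetical initial condition
options = odeset('RelTol', 1e-6, 'AbsTol', 1e-6);  % tolerance eps from the question

% The number of output samples is set by the time vector, not by the tolerance
T = linspace(0, 45, 10000);
[Tout, Y] = ode45(lorenz, T, Y0, options);

% Same samples via the dense output of a solution structure
sol = ode45(lorenz, [0 45], Y0, options);
Ydense = deval(sol, T);                            % 3-by-10000, one column per time
```

The tolerances still control the accuracy of the internal steps; the time vector only decides where the solution is reported.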
In the post "Empirical error proof Runge-Kutta algorithm ..." I also computed the error for the Lorenz system with a fixed-step RK method, which shows the same divergence of the solutions after a relatively short time.

Curve fitting in MATLAB, for a Sinusoidal function with more than 8 terms?

I'm trying to fit some data to a sum-of-sines function in MATLAB; however, the number of sine terms in the library model is limited to 1 ≤ n ≤ 8. I want more terms in my fit function, i.e., over 50 terms. Is there any way to make MATLAB fit my data to a sum of sines with more than 8 terms? Why does MATLAB have this constraint (is it technical or arbitrary)? Is there a toolbox for fitting sinusoidal functions (especially one that supports weighted data)?
>f = fit(X,Y, 'sin10')
>Error using fittype>iCreateFromLibrary (line 412)
>Library function sin10 not found.
It is o.k up to 'sin8' or 'sin9' parameters.
I appreciate any answer.
I found a solution to my question by accident while browsing the MATLAB help. I am posting this answer in the hope of helping people who have the same problem.
As a first attempt, I tried the 'fit' function. For some reason, custom 'fit'-based code like the following did not work:
FitOptions = fitoptions('Method','NonlinearLeastSquares', 'Algorithm', 'Trust-Region', 'MaxIter');
FitType = fittype('a*sin(1*f) + b*sin(2*f) + c*sin(3*f) + d*sin(4*f) + e*sin(5*f) + g*sin(6*f) + h*sin(7*f) + k*sin(8*f) + l*sin(9*f) + m*sin(10*f) + n*sin(11*f)', 'independent', 'f');
[FittedModel, GOF] = fit(freq, data, FitType)
% In the above code, phase parameters are not included; they could be added.
What I found is that the 'lsqcurvefit' function from the Optimization Toolbox makes custom function fitting more feasible and easier than the 'fit' function. I tested it by fitting my data to a sum of 12 (>8) sines with the code below:
clear;clc
xdata=1:0.1:10; % X or independent data
ydata=sin(xdata+0.2)+0.5*sin(0.3*xdata+0.3)+ 2*sin( 0.2*xdata+23 )+...
0.7*sin( 0.34*xdata+12 )+.76*sin( .23*xdata+.3 )+.98*sin(.76 *xdata+.56 )+...
.34*sin( .87*xdata+.123 )+.234*sin(.234 *xdata+23 ); % Y or dependent data
x0 = randn(36,1); % Initial guess
fun = @(x,xdata)x(1)*sin(x(2)*xdata+x(3))+...
x(4)*sin(x(5)*xdata+x(6))+...
x(7)*sin(x(8)*xdata+x(9))+...
x(10)*sin(x(11)*xdata+x(12))+...
x(13)*sin(x(14)*xdata+x(15))+...
x(16)*sin(x(17)*xdata+x(18))+...
x(19)*sin(x(20)*xdata+x(21))+...
x(22)*sin(x(23)*xdata+x(24))+...
x(25)*sin(x(26)*xdata+x(27))+...
x(28)*sin(x(29)*xdata+x(30))+...
x(31)*sin(x(32)*xdata+x(33))+...
x(34)*sin(x(35)*xdata+x(36)); % Goal function which is Sum of 12 sines
options = optimoptions('lsqcurvefit','Algorithm','trust-region-reflective'); % Options for fitting
x=lsqcurvefit(fun,x0,xdata,ydata,[],[],options) % the main instruction; options are passed after the (empty) bounds
times = linspace(xdata(1),xdata(end));
plot(xdata,ydata,'ko',times,fun(x,times),'r-')
legend('Data','Fitted Sum of 12 Sines')
title('Data and Fitted Curve')
The result is satisfactory so far (figure not preserved).
The remaining problem is that when I use the MATLAB fit function with a library sum-of-sines model (e.g., fit(xdata,ydata,'sin6')), it easily converges to a good solution and the fit is acceptable (figure not preserved),
but when I fit the same data using a custom-defined function, the results are not satisfactory at all (figure not preserved):
fun=@(x,xdata)a1*sin(b1*xdata+c1)+...+a6*sin(b6*xdata+c6); % Sum of six sines
f=fit(xdata,ydata,fun);
At first I suspected the fit function, so I tried other functions such as lsqcurvefit; it worked well for some data, but as soon as other data were used it started to misbehave.
From the MATLAB documentation, I figured out that sum-of-sines and Fourier fitting are extremely sensitive to the starting points, i.e., the values that the fitting algorithm assumes for the fitting parameters (amplitudes, frequencies and phases) in its first iteration. By inspecting the Curve Fitting Toolbox .m files, I noticed that MATLAB uses a clever trick to obtain the starting point when you use a library model (e.g., fit(x,y,'sin1') or fit(x,y,'sin2')), but when you enter your own custom function the starting points are generated randomly! This is why the built-in MATLAB models work and my custom function fitting does not (even though I enter the same function).
By the way, MATLAB computes the FFT of ydata and extracts starting points for the amplitudes, frequencies and phases through some (seemingly greedy) method (a function called startpt.m does this).
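The idea of an FFT-based starting point can be sketched for a single sine as follows. This is not the algorithm in startpt.m, only a simple peak-picking illustration; the synthetic data, the uniform grid starting at 0 and all variable names are assumptions. It requires the Optimization Toolbox for lsqcurvefit.

```matlab
% Synthetic single-sine data on a uniform grid starting at 0 (hypothetical values)
xdata = (0:0.05:20)';
ydata = 1.5 * sin(2.0 * xdata + 0.4);

n  = numel(xdata);
dx = xdata(2) - xdata(1);

% FFT of the mean-removed signal; search the positive-frequency bins only
Yf = fft(ydata - mean(ydata));
[~, k] = max(abs(Yf(2:floor(n/2))));  % k = number of cycles over the record
peak = Yf(k + 1);                     % bin k+1 holds frequency k/(n*dx)

a0 = 2 * abs(peak) / n;               % rough amplitude estimate
w0 = 2 * pi * k / (n * dx);           % rough angular frequency estimate
c0 = angle(peak) + pi/2;              % convert cosine phase to sine phase

% Use the FFT estimates instead of random numbers as the starting point
fun  = @(p, x) p(1) * sin(p(2) * x + p(3));
pFit = lsqcurvefit(fun, [a0, w0, c0], xdata, ydata);
```

For a sum of several sines, one possible extension is to repeat the peak search on the residual after each fitted term, which is a greedy scheme in the same spirit as the one described above.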

The result doesn't match what I expect when I use the log-normal PDF in MATLAB

I'm studying a paper that presents a figure showing the CDF of building heights.
The paper also gives details about this figure:
Building height statistics: The present model uses the statistics of
building heights in typical built-up areas as input data. A suitable
form was sought by comparing with geographical data for the city of
Guildford, United Kingdom. The probability density function that was
selected to fit the data was the log-normal distribution with unknown
parameters: mean value p and standard deviation t. As can be noted
from Fig. 3, it was found to be a good fit to the geographical data
values with parameters p = 7.3m, t= 0.26.
It says the mean value is 7.3 and the standard deviation is 0.26, right?
However, when I try them in MATLAB with the following code
x=0:0.01:20;
meanValue = 7.3;
standardDeviation = 0.26;
y1 = logncdf(x,meanValue,standardDeviation);
plot(x,y1);
the result is different from Figure 3.
I re-read the paper to make sure the parameters are correct
and checked the MATLAB documentation on how to use this function.
Everything seems all right except the simulation result.
Please help me fix it! Thanks
As mentioned in the comments, the parameters mu and sigma are the mean and standard deviation of the associated normal distribution, not of the log-normal distribution. The details, especially the connection between the two, are explained in the Wikipedia article.
The formulas to calculate mu and sigma from the mean and variance are given in the Wikipedia article, or here in MATLAB syntax:
m=7.3
t=0.26
v=t.^2;
%A lognormal distribution with mean m and variance v has parameters
mu = log((m^2)/sqrt(v+m^2));
sigma = sqrt(log(v/(m^2)+1));
%finally your code:
x=0:0.01:20;
y1 = logncdf(x,mu,sigma);
plot(x,y1);
This is much closer to the graph in your question, but that graph seems to be the CDF for a much larger standard deviation. Visually guessing the parameters from your plot, I would say it's roughly t=5.
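As a sanity check on the conversion, the following sketch maps the mean and standard deviation to mu and sigma and then back again, using the standard closed-form moments of the log-normal distribution. It needs no toolbox; the names mCheck and sCheck are introduced here only for illustration.

```matlab
m = 7.3;                 % mean of the log-normal distribution
s = 0.26;                % standard deviation of the log-normal distribution
v = s^2;                 % variance

% Parameters of the associated normal distribution
mu    = log(m^2 / sqrt(v + m^2));
sigma = sqrt(log(v / m^2 + 1));

% Closed-form moments of a log-normal distribution with parameters mu, sigma
mCheck = exp(mu + sigma^2 / 2);                            % mean
sCheck = sqrt((exp(sigma^2) - 1) * exp(2*mu + sigma^2));   % standard deviation

fprintf('mean = %.4f, std = %.4f\n', mCheck, sCheck);
```

If the round trip returns the original 7.3 and 0.26, the values of mu and sigma passed to logncdf are consistent with the stated mean and standard deviation.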

Differentiating a Centred and Scaled Polyfit Fit

I have some data which I wish to model in order to be able to get relatively accurate values in the same range as the data.
To do this I used polyfit to fit a 6th-order polynomial; because of my x-axis values, MATLAB suggested centring and scaling to get a more accurate fit, which I did.
However, I now want to find the derivative of this function in order to model the velocity.
I am not sure how the polyder function interacts with the centred and scaled polyfit that I have produced. (I don't want to use the unscaled model, as it is not very accurate.)
Here is some code which reproduces my problem. I attempted to rescale the x values before putting them into the fit for the derivative, but this still did not fix the problem.
x = 0:100;
y = 2*x.^2 + x + 1;
Fit = polyfit(x,y,2);
[ScaledFit,s,mu] = polyfit(x,y,2);
Deriv = polyder(Fit);
ScaledDeriv = polyder(ScaledFit);
plot(x,polyval(Deriv,x),'b.');
hold on
plot(x,polyval(ScaledDeriv,(x-mu(1))/mu(2)),'r.');
Here I have chosen a simple polynomial so that I could fit it accurately and produce the actual derivative.
Any help would be greatly appreciated thanks.
I am using Matlab R2014a BTW.
Edit.
I have been experimenting with it, and dividing the resulting derivative values by the standard deviation mu(2) gave a very close result, with errors in the range -3e-13 to about 5e-13.
polyval(ScaledDeriv,(x-mu(1))/mu(2))/mu(2);
Not sure quite why this is the case, is there another more elegant way to solve this?
Edit 2. Sorry for another edit, but I experimented further and found that for a larger sample, x = 1:1000;, the deviation became much bigger, up to 10. I am not sure whether this is due to a bad polyfit, even though it is centred and scaled, or to the way the derivative is plotted.
Thanks for your time
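A simple application of the chain rule gives
$$\frac{dp}{dx} = \frac{dp}{d\hat{x}} \, \frac{d\hat{x}}{dx}.$$
Since by definition
$$\hat{x} = \frac{x - \mu_1}{\mu_2},$$
where $\mu_1$ is mu(1), the mean of x, and $\mu_2$ is mu(2), its standard deviation, it follows that
$$\frac{dp}{dx} = \frac{1}{\mu_2} \, \frac{dp}{d\hat{x}}.$$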
Which is exactly what you have verified numerically.
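The chain-rule scaling can be checked with the example from the question. This is a minimal sketch; the names dScaled and maxErr are introduced here for illustration.

```matlab
x = 0:100;
y = 2*x.^2 + x + 1;                    % exact derivative is 4*x + 1

% Centred and scaled fit: the polynomial is in xhat = (x - mu(1)) / mu(2)
[pScaled, ~, mu] = polyfit(x, y, 2);

% Differentiate with respect to xhat, then apply the chain rule factor 1/mu(2)
dScaled = polyder(pScaled) / mu(2);

% Evaluate the derivative at the original x-values (polyval rescales x via mu)
dydx = polyval(dScaled, x, [], mu);

maxErr = max(abs(dydx - (4*x + 1)));   % should be at round-off level
```

Folding the factor 1/mu(2) into the coefficient vector means the derivative can be evaluated with the same polyval(..., x, [], mu) call as the fit itself.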
The lack of accuracy for large samples is due to the global, rather than local, polynomial fit that you have done. I would suggest that you try fitting your data with splines and obtain the derivative with fnder(). Another option is to apply the polyfit() function locally, i.e., to a small moving set of points, and then apply polyder() to each of the fitted polynomials.