On Solving ODE equations and specifying number of samples - matlab

I am trying to understand the following set of equations given here: https://matlabgeeks.com/tips-tutorials/modeling-with-odes-in-matlab-part-5b/
The equations are those of a chaotic Lorenz system. The tutorial is quite easy to understand, but what I do not follow is how to set the number of data points to generate, i.e., the length of the time series. Which parameter determines how many data points are generated? Can somebody please help? I have looked into other resources as well, but I could not understand them. For instance, by trial and error I found that if I specify
eps = 0.000001; T = [0 45] then the number of data points is about 7000. If I want 10,000 data points, I don't know what the values of these parameters should be.

As described in the article (and the previous parts 1 and 2 of the series), the sequence of sample points is generated dynamically so that each segment contributes about the same amount of truncation error towards the global error, weighted by the absolute and relative tolerances. The number of points is therefore a by-product of the tolerances and the length of the interval, not something you set directly. Additionally, ode45 interpolates inside each segment to produce 3 inner points, so that a plot appears curved even for large tolerances. That is, the internal segmentation is given by T(1:4:end); the other points are interpolated.
You can also prescribe your own sample times; the values there are likewise interpolated from the "dense output", that is, from the interpolants over the internally produced segmentation.
T = linspace(t0, tend, 7000);
[T, Y] = ode45('lorenz', T, Y0, options);
You could also extract the dense output via
sol = ode45('lorenz', [t0 tend], Y0, options);
and then use the provided interpolation to compute samples at arbitrary times
Y = deval(sol,T);
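As a minimal sketch of both options, the following asks for exactly 10,000 samples on [0, 45]. It assumes the standard Lorenz parameters (sigma = 10, rho = 28, beta = 8/3), an initial state of [1; 1; 1], and tolerances of 1e-6; none of these are taken from the tutorial, and the anonymous function f is only a stand-in for the 'lorenz' function file.

```matlab
% Lorenz right-hand side; parameter values are assumed, not from the tutorial.
sigma = 10; rho = 28; beta = 8/3;
f = @(t, y) [sigma*(y(2) - y(1)); ...
             y(1)*(rho - y(3)) - y(2); ...
             y(1)*y(2) - beta*y(3)];

Y0 = [1; 1; 1];                       % assumed initial state
options = odeset('RelTol', 1e-6, 'AbsTol', 1e-6);
N = 10000;                            % desired number of samples
T = linspace(0, 45, N);

% Option 1: pass the sample times directly; Y1 has one row per entry of T.
[T1, Y1] = ode45(f, T, Y0, options);

% Option 2: get the solution structure, then evaluate it where needed.
sol = ode45(f, [0 45], Y0, options);
Y2 = deval(sol, T).';                 % deval returns one column per time
```

In both cases the solver still chooses its own internal steps from the tolerances; only the points at which the result is reported change.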
In Empirical error proof Runge-Kutta algorithm ... I also computed the error for the Lorenz system for a fixed-step RK method, which shows the same divergence of the solutions after a relatively short time.

Related

Matlab fitting error using lsqcurvefit

I'm developing code to fit the Gompertz equation to a bacterial growth curve and am practicing with some example data provided at the following website:
http://www.math.tamu.edu/~phoward/m442/ia3sol.pdf.
According to this code the fit should almost match the data (graph given at the above webpage, page 3). However, when I run the code the actual data plots correctly, but the lsqcurvefit result fits very poorly and gives the following message:
Local minimum possible.
lsqcurvefit stopped because the size of the current step is less than
the default value of the step size tolerance.
Is there anything I am doing wrong?
Thank you for your time,
Laura
The problem lies in the linked document.
The Gompertz function is parametrized the following way:
%with parameters p(1) = K and p(2) = initial population
%p(3) = r.
V = p(1).*(p(2)/p(1)).^exp(-p(3)*t);
However, the initial parameters for the curve fitting are given in a different order in the p vector ([r, K, p0] instead of [K, p0, r]). Moreover, the result vector is also mixed up in the document.
Changing p0 to [1000, 3.93, 0.01] makes the curve fitting converge and gives a good fit.
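A small sketch of why the ordering matters, assuming the Optimization Toolbox is available. The data here are synthetic, generated from made-up parameters pTrue, and are not the data from the linked document; only the model and the initial guess come from the answer above.

```matlab
% Gompertz model with p = [K, initial population, r], as in the answer.
gompertz = @(p, t) p(1) .* (p(2) ./ p(1)) .^ exp(-p(3) .* t);

% Synthetic data from hypothetical parameters (not the document's data).
t = (0:10:200).';
pTrue = [500, 4, 0.02];
y = gompertz(pTrue, t);

% Initial guess in the same order as the model expects: [K, p0, r].
p0 = [1000, 3.93, 0.01];
opts = optimoptions('lsqcurvefit', 'Display', 'off');
[pFit, resnorm] = lsqcurvefit(gompertz, p0, t, y, [], [], opts);
```

If the entries of p0 are supplied in a different order than the model uses, the solver starts from a point that does not correspond to the intended guess and can stall in a poor local minimum.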

Matlab ode functions to get specified number of values/outputs

I have a function file with my differential equations, and I am running ode23s on the function in the standard form, i.e.
[t,m]=ode23s('DE_function',tspan,[mA pA mB pB mC pC mD],optionsDE,p)
I obtain about 150 output values for each of mA and so on. My ode23s call is working fine.
I have an experimental dataset for the same mA and so on, which I have to use to calculate the least-squares error. I am trying to do this:
a = m(:,1) - A(:,2); and so on. In my experimental data I have just 20 values, corresponding to 20 time points. I have defined the same time points for tspan as well. But since my matrices do not match in dimension, I am unable to proceed with my calculations. Is there a way to receive exactly 20 values at the 20 time points (such as 1, 2, etc.) from ode23s, or maybe a way to get and store only those?
I have been trying to find a solution for this error but have been unable to find anything suitable. Many thanks for any kind of suggestions and hints.
The Matlab documentation has all you need. When you call ode23s (or any of the other ODE solvers) you can specify the time locations in tspan.
"Interval of integration, specified as a vector. At minimum, tspan must be a two element vector [t0 tf] specifying the initial and final times. To obtain solutions at specific times between t0 and tf, use a longer vector of the form [t0,t1,t2,...,tf]. The elements in tspan must be all increasing or all decreasing."
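A minimal sketch of this, using a made-up scalar equation in place of DE_function and synthetic values in place of the experimental matrix A (both are assumptions for illustration only):

```matlab
% Hypothetical stand-in for DE_function: simple exponential decay.
rhs = @(t, m) -0.5 * m;

tspan = 1:20;                         % the 20 experimental time points
m0 = 3;                               % assumed initial value at t = 1

% With more than two entries in tspan, output is returned at those times only.
[t, m] = ode23s(rhs, tspan, m0);

% Synthetic "experimental" data: column 1 is time, column 2 is the measurement.
A = [tspan(:), 3 * exp(-0.5 * (tspan(:) - 1))];

a = m(:, 1) - A(:, 2);                % residuals, now 20-by-1
sse = sum(a .^ 2);                    % least-squares error
```

The solver still takes as many internal steps as it needs; tspan only controls where the solution is reported.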

Mixture of 1D Gaussians fit to data in Matlab / Python

I have a discrete curve y=f(x). I know the locations and amplitudes of peaks. I want to approximate the curve by fitting a gaussian at each peak. How should I go about finding the optimized gaussian parameters ? I would like to know if there is any inbuilt function which will make my task simpler.
Edit
I have fixed the means of the Gaussians and tried to optimize over sigma using lsqcurvefit() in MATLAB. The MSE is low. However, I have an additional hard constraint that the value of the approximate curve should be equal to the original function at the peaks. This constraint is not satisfied by my model. I am pasting my current working code here. I would like to have a solution which obeys the hard constraint at the peaks and approximately fits the curve at other points. The basic idea is that the approximate curve has fewer parameters but still closely resembles the original curve.
fun = @(x,xdata)myFun(x,xdata,pks,locs); %pks,locs are the peak amplitudes and locations already available
x0=w(1:6)*0.25; % my initial guess based on domain knowledge
[sigma resnorm] = lsqcurvefit(fun,x0,xdata,ydata); %xdata and ydata are the original curve data points
recons = myFun(sigma,xdata,pks,locs);
figure;plot(ydata,'r');hold on;plot(recons);
function f=myFun(sigma,xdata,a,c)
% a is constant , c is mean of individual gaussians
f=zeros(size(xdata));
for i = 1:6 %use 6 gaussians to approximate function
f = f + a(i) * exp(-(xdata-c(i)).^2 ./ (2*sigma(i)^2));
end
end
If you know your peak locations and amplitudes, then all you have left to do is find the width of each Gaussian. You can think of this as an optimization problem.
Say you have x and y, which are samples from the curve you want to approximate.
First, define a function g() that will construct the approximation for given values of the widths. g() takes a parameter vector sigma containing the width of each Gaussian. The locations and amplitudes of the Gaussians will be constrained to the values you already know. g() outputs the value of the sum-of-gaussians approximation at each point in x.
Now, define a loss function L(), which takes sigma as input. L(sigma) returns a scalar that measures the error--how badly the given approximation (using sigma) differs from the curve you're trying to approximate. The squared error is a common loss function for curve fitting:
L(sigma) = sum((y - g(sigma)) .^ 2)
The task now is to search over possible values of sigma, and find the choice that minimizes the error. This can be done using a variety of optimization routines.
If you have the MathWorks Optimization Toolbox, you can use the function lsqnonlin() (in this case you won't have to define L() yourself). The Curve Fitting Toolbox is probably an alternative. Otherwise, you can use an open source optimization routine (check out cvxopt).
A couple things to note. You need to impose the constraint that all values in sigma are greater than zero. You can tell the optimization algorithm about this constraint. Also, you'll need to specify an initial guess for the parameters (i.e. sigma). In this case, you could probably choose something reasonable by looking at the curve in the vicinity of each peak. It may be the case (when the loss function is nonconvex) that the final solution is different, depending on the initial guess (i.e. you converge to a local minimum). There are many fancy techniques for dealing with this kind of situation, but a simple thing to do is to just try with multiple different initial guesses, and pick the best result.
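The setup described above can be sketched as follows, assuming the Optimization Toolbox and MATLAB R2016b or later (for implicit expansion). The curve is synthetic, and the amplitudes, locations, true widths, initial guess, and lower bound are all made-up values for illustration.

```matlab
% Synthetic curve built from two Gaussians; all numbers here are made up.
x = linspace(0, 10, 200).';
pks = [2.0; 1.5];                     % known amplitudes
locs = [3; 7];                        % known peak locations
sigmaTrue = [0.5; 0.8];               % widths used to generate the data

% g(sigma): sum of Gaussians with fixed amplitudes and locations.
g = @(sigma) exp(-(x - locs.').^2 ./ (2 * sigma(:).'.^2)) * pks;
y = g(sigmaTrue);

sigma0 = [1; 1];                      % initial guess for the widths
lb = [1e-6; 1e-6];                    % keeps every sigma strictly positive
opts = optimoptions('lsqnonlin', 'Display', 'off');

% lsqnonlin minimizes the sum of squares of the residual vector g(s) - y.
[sigmaHat, resnorm] = lsqnonlin(@(s) g(s) - y, sigma0, lb, [], opts);
```

Rerunning this with several different sigma0 values and keeping the result with the smallest resnorm is the simple multi-start strategy mentioned above.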
Edited to add:
In python, you can use optimization routines in the scipy.optimize module, e.g. curve_fit().
Edit 2 (response to edited question):
If your Gaussians have much overlap with each other, then taking their sum may cause the height of the peaks to differ from your known values. In this case, you could take a weighted sum, and treat the weights as another parameter to optimize.
If you want the peak heights to be exactly equal to some specified values, you can enforce this constraint in the optimization problem. lsqcurvefit() won't be able to do it because it only handles bound constraints on the parameters. Take a look at fmincon().
You can use the Expectation–Maximization algorithm to fit a mixture of Gaussians to your data. It does not care about the data dimension.
In the MATLAB documentation you can look up gmdistribution.fit or fitgmdist.

Using Linear Prediction Over Time Series to Determine Next K Points

I have a time series of N data points of sunspots and would like to predict based on a subset of these points the remaining points in the series and then compare the correctness.
I'm just getting introduced to linear prediction using Matlab and so have decided that I would go the route of using the following code segment within a loop so that every point outside of the training set until the end of the given data has a prediction:
%x is the data, training set is some subset of x starting from beginning
%'unknown' is the number of points to extend the prediction over starting from the
%end of the training set (i.e. difference in length of training set and data vectors)
%x_pred is set to x initially
p = length(training_set);
coeffs = lpc(training_set, p);
for i=1:unknown
nextValue = -coeffs(2:end) * x_pred(end-unknown-1+i:-1:end-unknown-1+i-p+1)';
x_pred(end-unknown+i) = nextValue;
end
error = norm(x - x_pred)
I have three questions regarding this:
1) Does this appropriately do what I have described? I ask because my error seems rather large (>100) when predicting over only the last 20 points of a dataset that has hundreds of points.
2) Am I interpreting the second argument of lpc correctly? Namely, that it means the 'order' or rather number of points that you want to use in predicting the next point?
3) Is there a more efficient, single-line function in Matlab that I can call to replace the loop and compute all necessary predictions, given some subset of my overall data as a training set?
I tried looking through the lpc Matlab tutorial, but it didn't seem to do the prediction I have described. I have also been using How to use aryule() in Matlab to extend a number series? as a reference.
So after much deliberation and experimentation I have found the above approach to be correct and there does not appear to be any single Matlab function to do the above work. The large errors experienced are reasonable since I am using a linear prediction algorithm for a problem (i.e. sunspot prediction) that has inherent nonlinear behavior.
Hope this helps anyone else out there working on something similar.

Matlab recursive curve fitting with custom equations

I have an I-V curve. I also have an equation that I want to fit to this I-V curve, so I can adjust its constants. It is given by:
I = I01*(exp((V-R*I)/(n1*vth))-1)+I02*(exp((V-R*I)/(n2*vth))-1)
vth and R are constants already known, so I only want to determine I01, I02, n1, n2. The problem is: as you can see, I depends on itself. I was trying to use the curve fitting toolbox, but it doesn't seem to work on recursive equations.
Is there a way to make the curve fitting toolbox work on this? And if there isn't, what can I do?
Assuming that I01 and I02 are variables and not functions, then you should set the problem up like this:
a0 = [I01 I02 n1 n2]; % initial guesses for the parameters
MinFun = @(a) sum(abs(a(1)*(exp((V-R*I)/(a(3)*vth))-1) + a(2)*(exp((V-R*I)/(a(4)*vth))-1) - I));
aout = fminsearch(MinFun, a0);
By subtracting I, taking the absolute value, and summing over all data points, MinFun becomes zero (its minimum) exactly where both sides of the equation are equal.
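A self-contained sketch of this residual-minimization idea on synthetic data. The constants vth and R, the "true" parameters, and the initial guess are hypothetical. Two choices here are swapped in for illustration and differ from the snippet above: a squared residual instead of the absolute value, and a log10 scale for the two saturation currents so that fminsearch works with parameters of similar magnitude.

```matlab
% Hypothetical constants and "true" parameters, used only to make test data.
vth = 0.02585; R = 0.1;
aTrue = [1e-9, 1e-6, 1, 2];           % [I01, I02, n1, n2]

% Right-hand side of the implicit model for given parameters a.
model = @(a, V, I) a(1)*(exp((V - R*I)/(a(3)*vth)) - 1) ...
                 + a(2)*(exp((V - R*I)/(a(4)*vth)) - 1);

% Synthetic "measurements": solve model(aTrue, V, I) = I for each voltage.
V = linspace(0.1, 0.5, 20);
I = zeros(size(V));
for k = 1:numel(V)
    % The residual changes sign between I = 0 and I = V/R.
    I(k) = fzero(@(x) model(aTrue, V(k), x) - x, [0, V(k)/R]);
end

% Scalar objective: squared residuals with the measured I on both sides.
toParams = @(q) [10^q(1), 10^q(2), q(3), q(4)];
MinFun = @(q) sum((model(toParams(q), V, I) - I).^2);

q0 = [-8.5, -5.5, 1.2, 1.8];          % assumed initial guess (log10 currents)
qout = fminsearch(MinFun, q0, optimset('Display', 'off'));
aout = toParams(qout);                % back to [I01, I02, n1, n2]
```

With real, noisy measurements there is no guarantee that aout lands near the physical parameters, so comparing several starting points is advisable.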
No, the Curve Fitting Toolbox (CFTB) cannot fit such recursively defined functions. Moreover, since the true value of I is unknown at every point, errors in I create a kind of errors-in-variables problem. All you have are the "measured" values for I.
The problem of errors in I MAY be serious, since any errors in I, or lack of fit, noise, model problems, etc., will be used in the expression itself. Then you exponentiate these inaccurate values, potentially causing a mess.
You may be able to use an iterative approach. Thus something like
% 0. Initialize I_pred
I_pred = I;
% 1. Estimate the values of your coefficients, for this model:
% (The curve fitting toolbox CAN solve this problem, given I_pred)
I = I01*(exp((V-R*I_pred)/(n1*vth))-1)+I02*(exp((V-R*I_pred)/(n2*vth))-1)
% 2. Generate new predictions for I_pred
I_pred = I01*(exp((V-R*I_pred)/(n1*vth))-1)+I02*(exp((V-R*I_pred)/(n2*vth))-1)
% Repeat steps 1 and 2 until the parameters from the CFTB stabilize.
The above pseudo-code will work only if your starting values are good, and there are not large errors/noise in the model/data. Even on a good day, the above approach may not converge well. But I see little hope otherwise.