weighted curve fitting with lsqcurvefit - matlab

I wanted to fit an arbitrary function to my data set. Therefore, I used lsqcurvefit in MATLAB. Now I want to give weight to the fit procedure, meaning when curve fitting function (lsqcurvefit) is calculating the residue of the fit, some data point are more important than the others. To be more specific I want to use statistical weighting method.
w=1/y(x),
where w is a matrix contains the weight of each data point and y is the data set.
I cannot find anyway to make weighted curve fitting with lsqcurvefit. Is there any trick I should follow or is there any other function rather than lsqcurvefit which do it for me?

For doing weighting, I find it much easier to use lsqnonlin which is the function that lsqcurvefit calls to do the actual fitting.
You first have to define a function that you are trying to minimize, ie. a cost function. You need to pass in your weighting function as an extra parameter to your function as a vector:
x = yourIndependentVariable;
y = yourData;
weightVector = sqrt(abs(1./y));
costFunction = #(A) weightVector.*(yourModelFunction(A) - y);
aFit = lsqnonlin(costFunction,aGuess);
The reason for the square root in the weighting function definition is that lsqnonlin requires the residuals, not the squared residuals or their sum, so you need to pre-unsquare the weights.
Alternatively, if you have the Statistics Toolbox, you can use nlinfit which will accept a weighting vector/matrix as one of the optional inputs.

Related

How to Fit a decay exponential function in Matlab

I have to fit the dots, results of measurements, by an exponential function on Matlab. My profesor asked me to use only
fminsearch
polyval
polyfit
One of them or both. I have to find the parameters a and b (the value) which are fitting it.
There is the lines I wrote :
x=[1:10:70]
y=[0:10:70]
x=[12.5,11.8,10.8,10.9,6.5,6.2,6.1,5.423,4.625]
y=[0,0.61,1.3,1.4,14.9,18.5,20.1,29.7,58.2]
xlabel('Conductivité')
ylabel('Inductance')
The function has the form a*e^(-b*x) +c
Well polyfit and polyval are only usefull for working with polynomials. So you would have to write a minimization problem of the form min(f(x)).
functionToMinimize = #(pars, x, y)(norm(pars(1).*exp(-pars(2).*x) - y));
targetFunctionForFminseardch = #(pars)(functionToMinimize(pars, x, y));
minPars = fminsearch(targetFunctionForFminseardch, [0, 1])
Read up on anonymous functions and the use of vector norms if you have questions how to construct such a minimization problem.
Your code also has some flaws. Why are you defining x and y twice when you only want to use the actual measured data?

MATLAB: polyval function for N greater than 1

I am trying trying to graph the polynomial fit of a 2D dataset in Matlab.
This is what I tried:
rawTable = readtable('Test_data.xlsx','Sheet','Sheet1');
x = rawTable.A;
y = rawTable.B;
figure(1)
scatter(x,y)
c = polyfit(x,y,2);
y_fitted = polyval(c,x);
hold on
plot(x,y_fitted,'r','LineWidth',2)
rawTable.A and rawTable.A are randomly generated numbers. (i.e. the x dataset cannot be represented in the following form : x=0:0.1:100)
The result:
second-order polynomial
But the result I expect looks like this (generated in Excel):
enter image description here
How can I graph the second-order polynomial fit in MATLAB?
I sense some confusion regarding what the output of each of those Matlab function mean. So I'll clarify. And I think we need some details as well. So expect some verbosity. A quick answer, however, is available at the end.
c = polyfit(x,y,2) gives the coefficient vectors of the polynomial fit. You can get the fit information such as error estimate following the documentation.
Name this polynomial as P. P in Matlab is actually the function P=#(x)c(1)*x.^2+c(2)*x+c(3).
Suppose you have a single point X, then polyval(c,X) outputs the value of P(X). And if x is a vector, polyval(c,x) is a vector corresponding to [P(x(1)), P(x(2)),...].
Now that does not represent what the fit is. Just as a quick hack to see something visually, you can try plot(sort(x),polyval(c,sort(x)),'r','LineWidth',2), ie. you can first sort your data and try plotting on those x-values.
However, it is only a hack because a) your data set may be so irregularly spaced that the spline doesn't represent function or b) evaluating on the whole of your data set is unnecessary and inefficient.
The robust and 'standard' way to plot a 2D function of known analytical form in Matlab is as follows:
Define some evenly-spaced x-values over the interval you want to plot the function. For example, x=1:0.1:10. For example, x=linspace(0,1,100).
Evaluate the function on these x-values
Put the above two components into plot(). plot() can either plot the function as sampled points, or connect the points with automatic spline, which is the default.
(For step 1, quadrature is ambiguous but specific enough of a term to describe this process if you wish to communicate with a single word.)
So, instead of using the x in your original data set, you should do something like:
t=linspace(min(x),max(x),100);
plot(t,polyval(c,t),'r','LineWidth',2)

Is it possible to constrain the outcome of lsqnonlin in matlab to only real solutions?

I would like to fit a transfer function of a PD-controller + time-delay to frequency response data ( so these are complex numbers) in matlab. The fit function is: (P+Diw) exp(tau*iw) I used least-squares minimization with the Matlab function lsqnonlin and set the start and boundary values, to obtain the parameters P, D and tau. These values should be real numbers, however I obtain complex numbers, because my function and data are complex as well. Is there a way to constrain the solution to only real numbers?

Mixture of 1D Gaussians fit to data in Matlab / Python

I have a discrete curve y=f(x). I know the locations and amplitudes of peaks. I want to approximate the curve by fitting a gaussian at each peak. How should I go about finding the optimized gaussian parameters ? I would like to know if there is any inbuilt function which will make my task simpler.
Edit
I have fixed mean of gaussians and tried to optimize on sigma using
lsqcurvefit() in matlab. MSE is less. However, I have an additional hard constraint that the value of approximate curve should be equal to the original function at the peaks. This constraint is not satisfied by my model. I am pasting current working code here. I would like to have a solution which obeys the hard constraint at peaks and approximately fits the curve at other points. The basic idea is that the approximate curve has fewer parameters but still closely resembles the original curve.
fun = #(x,xdata)myFun(x,xdata,pks,locs); %pks,locs are the peak locations and amplitudes already available
x0=w(1:6)*0.25; % my initial guess based on domain knowledge
[sigma resnorm] = lsqcurvefit(fun,x0,xdata,ydata); %xdata and ydata are the original curve data points
recons = myFun(sigma,xdata,pks,locs);
figure;plot(ydata,'r');hold on;plot(recons);
function f=myFun(sigma,xdata,a,c)
% a is constant , c is mean of individual gaussians
f=zeros(size(xdata));
for i = 1:6 %use 6 gaussians to approximate function
f = f + a(i) * exp(-(xdata-c(i)).^2 ./ (2*sigma(i)^2));
end
end
If you know your peak locations and amplitudes, then all you have left to do is find the width of each Gaussian. You can think of this as an optimization problem.
Say you have x and y, which are samples from the curve you want to approximate.
First, define a function g() that will construct the approximation for given values of the widths. g() takes a parameter vector sigma containing the width of each Gaussian. The locations and amplitudes of the Gaussians will be constrained to the values you already know. g() outputs the value of the sum-of-gaussians approximation at each point in x.
Now, define a loss function L(), which takes sigma as input. L(sigma) returns a scalar that measures the error--how badly the given approximation (using sigma) differs from the curve you're trying to approximate. The squared error is a common loss function for curve fitting:
L(sigma) = sum((y - g(sigma)) .^ 2)
The task now is to search over possible values of sigma, and find the choice that minimizes the error. This can be done using a variety of optimization routines.
If you have the Mathworks optimization toolbox, you can use the function lsqnonlin() (in this case you won't have to define L() yourself). The curve fitting toolbox is probably an alternative. Otherwise, you can use an open source optimization routine (check out cvxopt).
A couple things to note. You need to impose the constraint that all values in sigma are greater than zero. You can tell the optimization algorithm about this constraint. Also, you'll need to specify an initial guess for the parameters (i.e. sigma). In this case, you could probably choose something reasonable by looking at the curve in the vicinity of each peak. It may be the case (when the loss function is nonconvex) that the final solution is different, depending on the initial guess (i.e. you converge to a local minimum). There are many fancy techniques for dealing with this kind of situation, but a simple thing to do is to just try with multiple different initial guesses, and pick the best result.
Edited to add:
In python, you can use optimization routines in the scipy.optimize module, e.g. curve_fit().
Edit 2 (response to edited question):
If your Gaussians have much overlap with each other, then taking their sum may cause the height of the peaks to differ from your known values. In this case, you could take a weighted sum, and treat the weights as another parameter to optimize.
If you want the peak heights to be exactly equal to some specified values, you can enforce this constraint in the optimization problem. lsqcurvefit() won't be able to do it because it only handles bound constraints on the parameters. Take a look at fmincon().
you can use Expectation–Maximization algorithm for fitting Mixture of Gaussians on your data. it don't care about data dimension.
in documentation of MATLAB you can lookup gmdistribution.fit or fitgmdist.

Curve Fitting for equation with two parameters

I have two arrays:
E= [6656400;
13322500;
19980900;
26625600;
33292900;
39942400;
46648900;
53290000]
and
J=[0.0000000021;
0.0000000047;
0.0000000128;
0.0000000201;
0.0000000659;
0.0000000748;
0.0000001143;
0.0000001397]
I want to find the appropriate curve fitting for the above data by applying this equation:
J=A0.*(298).^2.*exp(-(W-((((1.6e-19)^3)/(4*pi*2.3*8.854e-12))^0.5).*E.^0.5)./((1.38e-23).*298))
I want to select the starting value of W from 1e-19
I have tried the curve fitting tools but it is not helping me to solve it!
Then, I selected some random values of A0=1.2e9 and W=2.243e-19, it gave me a better results. But I want to find the right values by using the code (not the curve fitting Apps)
Can you help me please?
A quick (and potentially easy) solution method would be to pose the curve fit as a minimization problem.
Define a correlation function that takes the fit parameters as an argument:
% x(1) == A0; x(2) == W
Jfunc = #(x) x(1).*(298).^2.*exp(-(x(2)-((((1.6e-19)^3)/(4*pi*2.3*8.854e-12))^0.5).*E.^0.5)./((1.38e-23).*298));
Then a objective function to minimize. Since you have data J we'll minimize the sum-of-squares of the difference between the data and the correlation:
Objective = #(x) sum((Jfunc(x) - J).^2);
And then attempt to minimize the objective using fminsearch:
x0 = [1.2E9;2.243E-19];
sol = fminsearch(Objective,x0);
I used the guesses you gave. For nonlinear solutions, a good first guess is often important for convergence.
If you have the Optimization Toolbox, you can also try lsqcurvefit or lsqnonlin (fminsearch is vanilla MATLAB).