How to fit a polynomial to data with error bars - matlab

I need to fit data, e.g. x, y, CI (where CI is the confidence index of y), in Matlab.
Now, I use this code:
pf = polyfit(x, y, 2);
x1 = min(x):.1:max(x);
y1 = polyval(pf, x1);
figure
hold on
errorbar(x, y, CI, 'ko');
plot(x1, y1, 'k');
hold off
Of course, the fitted curve falls outside some of the error bars, which is expected since polyfit ignores them.
I would like to obtain a fitted curve that stays closer to the points with a low confidence index and down-weights or discards the points with a high confidence index.
Thank you and bye,
Giacomo

What you are looking for is weighted least squares, which you can compute with the function lscov. There is a nice example on its help page, but I'll try to make it clearer.
Let us construct a simple parabola with a corrupted point
x = (0:0.1:1)';
y = 0.5*x.^2;
y(5) = 3*y(5);
and give some weights
w = ones(size(y));
w(5) = 0.1;
Next, build the Vandermonde matrix and solve the weighted system
%// V = [x.^2 x ones(size(x))];
V = bsxfun(@power, x, 2:-1:0);
coeff = lscov(V, y, w);
The estimated coefficients, with and without the weights, are
               x^2       x          1
with weights   0.4797    0.0186    -0.0004
no weights     0.3322    0.1533    -0.0034
Note that in your case the weights w will have to be inverted: a large CI means low confidence, so such a point should get a small weight.
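A minimal sketch of how this could be applied to the data from the question (the choice of weighting is an assumption: 1./CI is the simplest inversion, while 1./CI.^2, i.e. inverse-variance weights, is another common option):
w  = 1 ./ CI(:);                    % low CI (high confidence) -> large weight
V  = bsxfun(@power, x(:), 2:-1:0);  % Vandermonde matrix for a quadratic
pf = lscov(V, y(:), w);             % weighted least-squares coefficients
x1 = min(x):.1:max(x);
y1 = polyval(pf, x1);               % coefficients are in descending powers, as polyval expects
errorbar(x, y, CI, 'ko'); hold on
plot(x1, y1, 'k'); hold off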
If you prefer not to build the Vandermonde matrix, and you have a license for the Curve Fitting Toolbox, you can use the following code
ft = fittype('poly2');
opts = fitoptions('Method', 'LinearLeastSquares');
opts.Weights = w;
fitresult = fit(x, y, ft, opts);
and you'll obtain the same result.
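Applied to the variables from the question, this would look something like the sketch below (again, the 1./CI weighting is an assumed choice):
ft   = fittype('poly2');
opts = fitoptions('Method', 'LinearLeastSquares');
opts.Weights = 1 ./ CI(:);              % down-weight points with a large CI
fitresult = fit(x(:), y(:), ft, opts);
x1 = min(x):.1:max(x);
errorbar(x, y, CI, 'ko'); hold on
plot(x1, fitresult(x1), 'k'); hold off  % cfit objects can be evaluated directly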

Related

Constrained linear least squares not fitting data

I am trying to fit a 3D surface polynomial of n degrees to some data points in 3D space. My system requires the surface to be monotonically increasing in the area of interest, that is, the partial derivatives must be non-negative. This can be achieved using MATLAB's built-in lsqlin function.
So here's what I've done to try and achieve this:
I have a function that takes in four parameters: x1 and x2 are my explanatory variables and y is my dependent variable. Finally, I can specify the order of the polynomial fit. First I build the design matrix A using the data from x1 and x2 and the degree of fit I want. Next I build the matrix D that holds the partial derivatives at my data points. NOTE: the matrix D is double the length of matrix A, since all data points must be differentiated with respect to both x1 and x2. I specify that D*x >= 0 by setting b to zeros.
Finally, I call lsqlin. I pass "-D" since MATLAB defines the inequality constraint as D*x <= b.
function w_mono = monotone_surface_fit(x1, x2, y, order_fit)
    % Initialize design matrix
    A = zeros(length(x1), 2*order_fit+1);
    % Adjusting for bias term
    A(:,1) = ones(length(x1),1);
    % Building design matrix
    for i = 2:order_fit+1
        A(:,(i-1)*2:(i-1)*2+1) = [x1.^(i-1), x2.^(i-1)];
    end
    % Initialize matrix containing derivative constraint.
    % NOTE: Partial derivatives must be non-negative
    D = zeros(2*length(y), 2*order_fit+1);
    % Filling matrix that holds constraints for partial derivatives
    % NOTE: Matrix D will be double the length of A since all data points will have
    % a partial derivative constraint in both the x1 and x2 directions.
    for i = 2:order_fit+1
        D(:,(i-1)*2:(i-1)*2+1) = [(i-1)*x1.^(i-2), zeros(length(x2),1); ...
                                  zeros(length(x1),1), (i-1)*x2.^(i-2)];
    end
    % Limit of derivatives
    b = zeros(2*length(y), 1);
    % Constrained LSQ fit
    options = optimoptions('lsqlin','Algorithm','interior-point');
    % Final weights of polynomial
    w_mono = lsqlin(A,y,-D,b,[],[],[],[],[], options);
end
So now I get some weights out, but unfortunately they do not at all capture the structure of the data. I've attached an image so you can see just how bad it looks.
I'll give you my plotting script with some dummy data, so you can try it.
%% Plot different order polynomials to data with constraints
x1 = [-5;12;4;9;18;-1;-8;13;0;7;-5;-8;-6;14;-1;1;9;14;12;1;-5;9;-10;-2;9;7;-1;19;-7;12;-6;3;14;0;-8;6;-2;-7;10;4;-5;-7;-4;-6;-1;18;5;-3;3;10];
x2 = [81.25;61;73;61.75;54.5;72.25;80;56.75;78;64.25;85.25;86;80.5;61.5;79.25;76.75;60.75;54.5;62;75.75;80.25;67.75;86.5;81.5;62.75;66.25;78.25;49.25;82.75;56;84.5;71.25;58.5;77;82;70.5;81.5;80.75;64.5;68;78.25;79.75;81;82.5;79.25;49.5;64.75;77.75;70.25;64.5];
y = [-6.52857142857143;-1.04736842105263;-5.18750000000000;-3.33157894736842;-0.117894736842105;-3.58571428571429;-5.61428571428572;0;-4.47142857142857;-1.75438596491228;-7.30555555555556;-8.82222222222222;-5.50000000000000;-2.95438596491228;-5.78571428571429;-5.15714285714286;-1.22631578947368;-0.340350877192983;-0.142105263157895;-2.98571428571429;-4.35714285714286;-0.963157894736842;-9.06666666666667;-4.27142857142857;-3.43684210526316;-3.97894736842105;-6.61428571428572;0;-4.98571428571429;-0.573684210526316;-8.22500000000000;-3.01428571428571;-0.691228070175439;-6.30000000000000;-6.95714285714286;-2.57232142857143;-5.27142857142857;-7.64285714285714;-2.54035087719298;-3.45438596491228;-5.01428571428571;-7.47142857142857;-5.38571428571429;-4.84285714285714;-6.78571428571429;0;-0.973684210526316;-4.72857142857143;-2.84285714285714;-2.54035087719298];
% Used to plot the surface in all points in the grid
X1 = meshgrid(-10:1:20);
X2 = flipud(meshgrid(30:2:90).');
figure;
for i = 1:4
    w_mono = monotone_surface_fit(x1, x2, y, i);
    y_nr = w_mono(1)*ones(size(X1)) + w_mono(2)*ones(size(X2));
    for j = 1:i
        y_nr = w_mono(j*2)*X1.^j + w_mono(j*2+1)*X2.^j;
    end
    subplot(2,2,i);
    scatter3(x1, x2, y); hold on;
    axis tight;
    mesh(X1, X2, y_nr);
    set(gca, 'ZDir','reverse');
    xlabel('x1'); ylabel('x2');
    zlabel('y');
    % zlim([-10 0])
end
I think it may have something to do with the fact that I haven't specified anything about the region of interest, but really I don't know. Thanks in advance for any help.
Alright I figured it out.
The main problem was simply an error in the plotting script. The value of y_nr should be updated and not overwritten in the loop.
Also I figured out that the surface should be monotonically decreasing along the second variable, so the sign of that derivative constraint is flipped. Here's the updated code if anybody is interested.
%% Plot different order polynomials to data with constraints
x1 = [-5;12;4;9;18;-1;-8;13;0;7;-5;-8;-6;14;-1;1;9;14;12;1;-5;9;-10;-2;9;7;-1;19;-7;12;-6;3;14;0;-8;6;-2;-7;10;4;-5;-7;-4;-6;-1;18;5;-3;3;10];
x2 = [81.25;61;73;61.75;54.5;72.25;80;56.75;78;64.25;85.25;86;80.5;61.5;79.25;76.75;60.75;54.5;62;75.75;80.25;67.75;86.5;81.5;62.75;66.25;78.25;49.25;82.75;56;84.5;71.25;58.5;77;82;70.5;81.5;80.75;64.5;68;78.25;79.75;81;82.5;79.25;49.5;64.75;77.75;70.25;64.5];
y = [-6.52857142857143;-1.04736842105263;-5.18750000000000;-3.33157894736842;-0.117894736842105;-3.58571428571429;-5.61428571428572;0;-4.47142857142857;-1.75438596491228;-7.30555555555556;-8.82222222222222;-5.50000000000000;-2.95438596491228;-5.78571428571429;-5.15714285714286;-1.22631578947368;-0.340350877192983;-0.142105263157895;-2.98571428571429;-4.35714285714286;-0.963157894736842;-9.06666666666667;-4.27142857142857;-3.43684210526316;-3.97894736842105;-6.61428571428572;0;-4.98571428571429;-0.573684210526316;-8.22500000000000;-3.01428571428571;-0.691228070175439;-6.30000000000000;-6.95714285714286;-2.57232142857143;-5.27142857142857;-7.64285714285714;-2.54035087719298;-3.45438596491228;-5.01428571428571;-7.47142857142857;-5.38571428571429;-4.84285714285714;-6.78571428571429;0;-0.973684210526316;-4.72857142857143;-2.84285714285714;-2.54035087719298];
% Used to plot the surface in all points in the grid
X1 = meshgrid(-10:1:20);
X2 = flipud(meshgrid(30:2:90).');
figure;
for i = 1:4
    w_mono = monotone_surface_fit(x1, x2, y, i);
    % NOTE: Should only have 1 bias term
    y_nr = w_mono(1)*ones(size(X1));
    for j = 1:i
        y_nr = y_nr + w_mono(j*2)*X1.^j + w_mono(j*2+1)*X2.^j;
    end
    subplot(2,2,i);
    scatter3(x1, x2, y); hold on;
    axis tight;
    mesh(X1, X2, y_nr);
    set(gca, 'ZDir','reverse');
    xlabel('x1'); ylabel('x2');
    zlabel('y');
    % zlim([-10 0])
end
And here's the updated function
function [w_mono, w] = monotone_surface_fit(x1, x2, y, order_fit)
    % Initialize design matrix
    A = zeros(length(x1), 2*order_fit+1);
    % Adjusting for bias term
    A(:,1) = ones(length(x1),1);
    % Building design matrix
    for i = 2:order_fit+1
        A(:,(i-1)*2:(i-1)*2+1) = [x1.^(i-1), x2.^(i-1)];
    end
    % Initialize matrix containing derivative constraints.
    % NOTE: The derivative w.r.t. x1 must be non-negative and the
    % derivative w.r.t. x2 non-positive, hence the minus sign below.
    D = zeros(2*length(y), 2*order_fit+1);
    for i = 2:order_fit+1
        D(:,(i-1)*2:(i-1)*2+1) = [(i-1)*x1.^(i-2), zeros(length(x2),1); ...
                                  zeros(length(x1),1), -(i-1)*x2.^(i-2)];
    end
    % Limit of derivatives
    b = zeros(2*length(y), 1);
    % Constrained LSQ fit
    options = optimoptions('lsqlin','Algorithm','active-set');
    w_mono = lsqlin(A,y,-D,b,[],[],[],[],[], options);
    w = lsqlin(A,y);
end
Finally, a plot of the fit (I have used a new simulation, but the fit also works on the dummy data given above).
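As an optional sanity check (not part of the original answer), the fitted partial derivatives can be evaluated on the plotting grid; X1, X2, w_mono and the order i are assumed to be left over from the loop above, and since lsqlin only enforces the constraints at the data points, small violations elsewhere are possible:
dydx1 = zeros(size(X1));
dydx2 = zeros(size(X2));
for j = 1:i
    dydx1 = dydx1 + j*w_mono(j*2)*X1.^(j-1);     % d(y_nr)/dx1
    dydx2 = dydx2 + j*w_mono(j*2+1)*X2.^(j-1);   % d(y_nr)/dx2
end
fprintf('min d/dx1 = %g (want >= 0)\n', min(dydx1(:)));
fprintf('max d/dx2 = %g (want <= 0)\n', max(dydx2(:)));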

Multivariate Normal Distribution Matlab, probability area

I have 2 arrays: one with x-coordinates, the other with y-coordinates.
Both are normal distributions resulting from a Monte Carlo simulation. I know how to find the sigma and mu for both arrays, and how to get a 95% confidence interval:
[mu,sigma]=normfit(x_array);
hist(x_array);
x=norminv([0.025 0.975],mu,sigma)
However, the two arrays are correlated with each other. To plot the probability distribution of the combined arrays, I use the multivariate normal distribution. In MATLAB this gives me:
[MuX,SigmaX]=normfit(x_array);
[MuY,SigmaY]=normfit(y_array);
mu = [MuX MuY];
Sigma=cov(x_array,y_array);
x1 = MuX-4*SigmaX:5:MuX+4*SigmaX; x2 = MuY-4*SigmaY:5:MuY+4*SigmaY;
[X1,X2] = meshgrid(x1,x2);
F = mvnpdf([X1(:) X2(:)],mu,Sigma);
F = reshape(F,length(x2),length(x1));
surf(x1,x2,F);
caxis([min(F(:))-.5*range(F(:)),max(F(:))]);
set(gca,'Ydir','reverse')
xlabel('x0-as'); ylabel('y0-as'); zlabel('Probability Density');
So far so good. Now I want to calculate the 95% probability area. I'm looking for a function like mndinv, analogous to norminv. However, such a function doesn't exist in MATLAB, which makes sense because there are endless possibilities... Does somebody have a tip on how to get a 95% probability area? Thanks in advance.
For the bivariate case you can add the ellipse that encloses 95% of the probability mass. This ellipse is uniquely identified; for a proof see the first source listed below.
% Suppose you know the distribution params, or you got them from normfit()
mu = [3, 7];
sigma = [1,   2.5
         2.5, 9];
% X/Y values for plotting grid
x = linspace(mu(1)-3*sqrt(sigma(1)), mu(1)+3*sqrt(sigma(1)),100);
y = linspace(mu(2)-3*sqrt(sigma(end)), mu(2)+3*sqrt(sigma(end)),100);
% Z values
[X1,X2] = meshgrid(x,y);
Z = mvnpdf([X1(:) X2(:)],mu,sigma);
Z = reshape(Z,length(y),length(x));
% Plot
h = pcolor(x,y,Z);
set(h,'LineStyle','none')
hold on
% Add level set
alpha = 0.05;
r = sqrt(-2*log(alpha));
rho = sigma(2)/sqrt(sigma(1)*sigma(end));
M = [sqrt(sigma(1))  rho*sqrt(sigma(end))
     0               sqrt(sigma(end)-sigma(end)*rho^2)];
theta = 0:0.1:2*pi;
f = bsxfun(@plus, r*[cos(theta)', sin(theta)']*M, mu);
plot(f(:,1), f(:,2),'--r')
Sources
https://upload.wikimedia.org/wikipedia/commons/a/a2/Cumulative_function_n_dimensional_Gaussians_12.2013.pdf
https://en.wikipedia.org/wiki/Multivariate_normal_distribution
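As an optional sanity check (not in the original answer; it assumes the Statistics Toolbox function mvnrnd and reuses mu, sigma and r from the code above), roughly 95% of random samples should fall inside the plotted ellipse:
XS = mvnrnd(mu, sigma, 1e5);          % samples from the same distribution
L  = chol(sigma, 'lower');
Z  = bsxfun(@minus, XS, mu) / L';     % whitened samples
d2 = sum(Z.^2, 2);                    % squared Mahalanobis distances
mean(d2 <= r^2)                       % should be close to 0.95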
To get the numerical value of F above which the top part lies, you can use top5 = prctile(F(:), 95). This returns the value of F that separates the bottom 95% of the values from the top 5%.
Then you can get just the top 5% with
Ftop=zeros(size(F));
Ftop=F>top5;
Ftop=Ftop.*F;
%// optional: Ftop(Ftop==0)=NaN;
surf(x1,x2,Ftop,'LineStyle','none');
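To see how much probability mass the highlighted region actually contains, one could numerically integrate the density over it (a hypothetical check, not part of the answer; it reuses F and top5 from above and the grid step of 5 used for both x1 and x2 in the question):
dx = 5; dy = 5;                       % grid spacing of x1 and x2 above
mass = sum(F(F > top5)) * dx * dy;    % approximate integral of the density over the region
fprintf('probability mass in region: %.3f\n', mass);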

Least squares fit, unknown intercept

I have three data points through which I have to fit a straight line of the form Y = m*X + C. I want the line to have a pre-determined slope 'm', but the constant 'C' can change so as to give the least error when fitting in MATLAB. Can someone help me out?
Just do the math:
C= mean(Y)-m*mean(X)
assuming Y is the vector containing the y coordinates, and X the x coordinates.
Reference: http://hotmath.com/hotmath_help/topics/line-of-best-fit.html
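A minimal numerical sketch of this formula (the data here are made up for illustration):
m = 3;                                % known, pre-determined slope
X = (1:10).';
Y = m*X + 2 + randn(size(X));         % noisy line with true intercept 2
C = mean(Y) - m*mean(X);              % least-squares intercept for the fixed slope
plot(X, Y, 'ko', X, m*X + C, 'k-')    % data and fitted line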
If you opt to use the Curve Fitting Toolbox the solution is as follows.
To start generate some data
m = 3;
x = (1:10).';
y = m*x + 2 + randn(size(x));
then select the model to fit and set the bounds for its coefficients
ft = fittype('poly1');
opts = fitoptions('Method', 'LinearLeastSquares');
opts.Lower = [m -Inf];
opts.Upper = [m Inf];
finally call the fitting routine
[fitresult, gof] = fit(x, y, ft, opts);
The intercept is stored in fitresult.p2.

Parabola plot with data in Octave

I have been trying to fit a parabola to the part of the data where y is positive. I am told that P1(x1,y1) is the first data point, Pg(xg,yg) is the last, and that the top point is at x = (x1+xg)/2. I have written the following:
x=data(:,1);
y=data(:,2);
filter=y>0;
xp=x(filter);
yp=y(filter);
P1=[xp(1) yp(1)];
Pg=[xp(end) yp(end)];
xT=(xp(1)+xp(end))/2;
A=[1 xp(1)   power(xp(1),2)   % as the equation is y = a0 + x*a1 + x^2 * a2
   1 xp(end) power(xp(end),2)
   0 1       2*xT];           % as the top point satisfies dy/dx = a1 + 2*x*a2 = 0
b=[yp(1) yp(end) 0]';
p=A\b;
x_fit=[xp(1):0.1:xp(end)];
y_fit=power(x_fit,2).*p(3)+x_fit.*p(2)+p(1);
figure
plot(xp,yp,'*')
hold on
plot(x_fit,y_fit,'r')
And then I get this parabola which is completely wrong. It doesn't fit the data at all! Can someone please tell me what's wrong with the code?
My parabola
Well, I think the primary problem is some mistake in your calculation. I think you should use three points on the parabola to obtain a system of linear equations. There is no need to calculate the derivative of your function as you do with
dy/dx = a1 + 2*x*a2 = 0
Instead of a point on the derivative you choose another point in your scatter plot, e.g. the maximum: PT = [xp_max yp_max]; and use it for your matrix A and b.
The equation dy/dx = a1 + 2*x*a2 = 0 does not fulfill the basic scheme of your system of linear equations: a0 + a1*x + a2*x^2 = y;
By the way: if you don't necessarily have to calculate your parabola in this way, you can have a look at the MATLAB/Octave function polyfit(), which calculates the least-squares solution to your problem. This results in a simple implementation:
p = polyfit(x, y, 2);
y2 = polyval(p, x);
figure(); plot(x, y, '*'); hold on;
plot(x, y2, 'or');
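If, as in the question, only the points with positive y should be fitted, the same idea can be applied to the filtered vectors xp and yp from the question's code (a small sketch, not part of the original answer):
p = polyfit(xp, yp, 2);                 % quadratic through the positive-y points only
x_fit = linspace(min(xp), max(xp), 100);
y_fit = polyval(p, x_fit);
figure; plot(xp, yp, '*'); hold on;
plot(x_fit, y_fit, 'r');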

matlab determine curve

Does anyone know how to obtain a mean curve given a matrix with the corresponding x,y points from the original plot? I mean, I want a single average curve.
Any code or just ideas would be very helpful, since I am new to MATLAB.
Thank you very much!
Well, one thing you can do is fit a parametric curve. Here's an example of how to do this for a figure-eight with noise on it:
function findParamFit
    clc, clf, hold on
    %# some sample data
    noise = @(t) 0.25*rand(size(t))-0.125;
    x = @(t) cos(t) + noise(t);
    y = @(t) sin(2*t) + noise(t);
    t = linspace(-100*rand, +100*rand, 1e4);
    %# initial data
    plot(x(t), y(t), 'b.')
    %# find fits
    options = optimset(...
        'tolfun', 1e-12,...
        'tolx', 1e-12);
    a = lsqcurvefit(@myFun_x, [1 1], t, x(t), -10,10, options);
    b = lsqcurvefit(@myFun_y, [1 2], t, y(t), -10,10, options);
    %# fitted curve
    xx = myFun_x(a,t);
    yy = myFun_y(b,t);
    plot(xx, yy, 'r.')
end
function F = myFun_x(a, tt)
    F = a(1)*cos(a(2)*tt);
end
function F = myFun_y(b, tt)
    F = b(1)*sin(b(2)*tt);
end
Note that this is a particularly bad way to fit parametric curves, as is apparent here from the extreme sensitivity of the solution to the quality of the initial values given to lsqcurvefit. Nevertheless, fitting a parametric curve is the way to go.
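One hypothetical way to reduce that sensitivity (not part of the original answer; it reuses t, x, options and myFun_x from the code above) is a simple multi-start loop that keeps the fit with the smallest residual:
bestRes = inf;
for k = 1:20
    a0 = 4*rand(1,2) - 2;             % random starting guess in [-2, 2]
    [a, res] = lsqcurvefit(@myFun_x, a0, t, x(t), [-10 -10], [10 10], options);
    if res < bestRes
        bestRes = res;
        aBest = a;
    end
end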
There's your google query :)