How can I do a linear regression in MATLAB when several data points have equal x values?
Here is an example with minimal data (not the data I actually use):
y = [1,2,3,4,5,6,7,8,9,10];
x = [2,2,2,4,4,6,6,6,10,10];
If I use polyfit or the \ operator:
x = x'; y = y';   % column vectors (in my real code: x = temp(:,1); y = temp(:,2);)
b1 = x\y;         % least-squares slope, with no intercept term
yCalc1 = b1*x;
plot(x,yCalc1,'-r');
then the linear regression is wrong, because (I suppose) it didn't account for several values sharing the same x.
Here is a graph of my real data. Blue dots: my data. Red line: the linear regression (it is wrong). Ignore the green dashed line:
And here is the "same" graph, done with Excel:
Blue dots: my data. Red line: the linear regression (it is right).
Do you think it would be mathematically correct to take the mean of the y values that share the same x?
If you intend to solve simple linear regression in the matrix form Y = XB with the \ operator, you need to add an extra column of ones to your X in order to estimate the intercept.
y0 = [1,2,3,4,5,6,7,8,9,10];
x0 = [2,2,2,4,4,6,6,6,10,10];
X1 = [ones(length(x0),1) x0'];  % prepend a column of ones for the intercept
b = X1\y0';                     % b(1) = intercept, b(2) = slope
y = b(1) + x0*b(2);
plot(x0,y0,'o')
hold on
plot(x0,y,'--r')
You can find a good Matlab example here
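As a cross-check, polyfit with degree 1 should give the same slope and intercept; a minimal sketch using the example data above (note that polyfit returns the coefficients highest power first):
y0 = [1,2,3,4,5,6,7,8,9,10];
x0 = [2,2,2,4,4,6,6,6,10,10];
p = polyfit(x0, y0, 1);   % p(1) = slope, p(2) = intercept
% p(2) should match b(1) and p(1) should match b(2) from the backslash solution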
So, Dan suggested a function to me and it works now.
If you want to do the same thing, just do it like this:
use the fitlm function (http://fr.mathworks.com/help/stats/fitlm.html?refresh=true#bunfd6c-2)
Example data:
y = [1,2,3,4,5,6,7,8,9,10];
x = [2,2,2,4,4,6,6,6,10,10];
tbl = table(x', y', 'VariableNames', {'x','y'});  % table expects column variables
lm = fitlm(tbl,'linear')
and MATLAB will display the fitted model.
A linear regression is an equation of the form y = ax + b. In the result, a corresponds to the x coefficient (below, 0.15663) and b corresponds to (Intercept) (below, 1.4377).
With my real data, MATLAB shows this result:
Linear regression model:
    y ~ 1 + x

Estimated Coefficients:
                   Estimate        SE        tStat      pValue
                   ________    _________    ______    ___________
    (Intercept)     1.4377      0.031151    46.151    5.8802e-290
    x              0.15663     0.0054355    28.816    1.2346e-145

Number of observations: 1499, Error degrees of freedom: 1497
Root Mean Squared Error: 0.135
R-squared: 0.357, Adjusted R-Squared 0.356
F-statistic vs. constant model: 830, p-value = 1.23e-145
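If you then want the numeric coefficients or a plot, they can be pulled from the returned model object; a small sketch, assuming lm and the example x, y from above:
b = lm.Coefficients.Estimate;               % b(1) = intercept, b(2) = slope of x
xf = linspace(min(x), max(x), 100);
plot(x, y, 'o', xf, b(1) + b(2)*xf, '-r')   % data and fitted regression line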
Thanks to Dan again!
Related
I am pretty new to MATLAB computation and would appreciate any help on this. Firstly, I am trying to integrate a cosine function using the Maclaurin expansion [cos(z) = 1 - (z^2)/2! + (z^4)/4! - ...], plotted, let's say, over one cycle from 0 to 2π. This first integral (as seen in Figure 1) would serve as a "reference" or "contour" for what I want to do next.
Figure 1 - "The first integral representation"
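For reference, here is a minimal sketch of that truncated Maclaurin expansion against MATLAB's cos over one cycle (the number of terms, nterms, is an arbitrary choice of mine):
z = linspace(0, 2*pi, 200);
nterms = 6;
approx = zeros(size(z));
for k = 0:nterms-1
    approx = approx + (-1)^k * z.^(2*k) / factorial(2*k);   % Maclaurin terms of cos
end
plot(z, approx, z, cos(z), '--')
legend('Maclaurin approximation', 'cos(z)')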
Now, my next problem comes from writing cos(z) in MATLAB in terms of an integral in the complex plane.
Figure 2 - "showing cos(z) as an integral in the complex plane"
where I could choose an equally sampled set of points 'sn' along the contour, from 'sL' to 'sU', separated by '∆s'. This is Euler's method.
I am trying to write code to numerically approximate the integral along a suitable contour and plot the approximation against the exact value of cos(z). I would like to do this twice: once for z ∈ [0, 6π] and once for complex-valued z in the range z ∈ [0 + i, 6π + i]. Then I want to plot both the real and imaginary parts of the computed cos(z).
I am aware of the steps I need to implement, which I'll list here.
Choose γ, SL, SU, N.
Step through z from z lower to z upper (use a different number of steps, other than N, for this).
For each value of z, compute cos z by stepping along the contour in N discrete steps from SL to SU.
For each value of sn along the contour, compute the integrand e^(sn - z^2/(4*sn))/sqrt(sn) and add it to the rolling sum [I have attached Figure 3 showing the integrand as a formula, if it's not clear!].
Figure 3 - "The exponential integrand I am looking to compute"
Now I will show what I have attempted in MATLAB!
% "Contour Integral Representation of Cosine"
% making 'z' a possible number such as pi
N = 10000000; % example number - meaning sample of steps
z_lower = 0;
z_upper = 6*pi;
%==========================%
z1 = linspace(z_lower,z_upper,N);
y = 1;
Sl = y - 10*1i;
sum = 0.0;
%==========================%
for z = linspace(z_lower,z_upper,N)
for Sn = linspace(Sl,Su,N)
sum = sum + ((exp(Sn) - (z.^2/4*Sn))/sqrt(Sn))*ds;
end
end
sum = sum*(sqrt(pi)/2*pi*1i);
plot(Sn,sum)
Edit 1: This figure shows what I am expecting: the numerical method will not be exactly the same as the "symbolic" integration, let's say. In Figure 4, the black cosine wave is the same as in Figure 1 and the blue curve is the numerical integration method.
Figure 4 - "End result of what I expect to plot"
I would like to compute the derivative of a complex-valued function (Holomorphic function) numerically in MATLAB.
I have computed the function in a grid on the complex plane, and I've tried to compute the derivative using the Cauchy–Riemann relations.
Given:
u = real(f), v = imag(f), x = real(point), y = imag(point)
The derivative should be given by: f' = du/dx + i dv/dx = dv/dy - i du/dy
where 'd' is the derivative operator.
I've tried the following code:
stepx = 0.01;
stepy = 0.01;
Nx = 2/stepx +1;
Ny = 2/stepy +1;
[re,im] = meshgrid([-1:stepx:1], [-1:stepy:1]);
cplx = re + 1i*im;
z = cplx.^3;
The derivative should be given by:
f1 = diff(real(z),1,2)/stepx +1i* diff(imag(z),1,2)/stepx;
or
f2 = diff(imag(z),1,1)/stepy - 1i* diff(real(z),1,1)/stepy;
But the two derivatives, which are supposed to be equal, do not match.
What am I doing wrong?
Let's count the number of elements which differ by less than stepx (assuming stepx = stepy):
lm = min(size(f1));
A = f1(1:lm,1:lm);
B = f2(1:lm,1:lm);
sum(sum(abs(A - B) <= stepx))
and using the fix proposed by @A. Donda:
f1i = interp1(1 : Ny, f1, 1.5 : Ny);
f2i = interp1(1 : Nx, f2 .', 1.5 : Nx) .';
sum(sum(abs(f1i - f2i) <= stepx))
In the second case all elements differ by less than stepx, as they should, while in the first case this is not true.
The problem is that, using the two different expressions as discretized for MATLAB, you are computing the approximate derivatives at different points in the complex plane. Say you are at imaginary value y and you compute the differences along the real axis x: then the i-th difference estimates the derivative at (x(i) + x(i + 1))/2, i.e. at all the midpoints between two subsequent x-values. The other way around, you estimate the derivative at a given x, but at all the midpoints between two subsequent y-values.
This also leads to different sizes of the resulting matrices. Using the first formula you get a matrix of size 201x200, the other of size 200x201. That's because in the first variant there are 200 midpoints along x, but 201 y-values, and vice versa.
So the answer is, you are not doing anything wrong, you are just interpreting the result wrongly.
You can solve the problem by interpolating explicitly along the other dimension (the one not used for the derivative):
f1i = interp1(1 : Ny, f1, 1.5 : Ny);
f2i = interp1(1 : Nx, f2 .', 1.5 : Nx) .';
Where f1 is computed according to your first formula and f2 according to the second. Now both derivatives are evaluated at points that are midpoints along both dimensions, which is why both matrices are of size 200x200.
If you compare them now, you will see that they are identical up to numerical error (after all, diff computes only approximate derivatives and interp1 makes interpolation errors). For your stepsize, this error is maximally 1e-4, and it can be further reduced by using a smaller stepsize.
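Putting the pieces together, here is a self-contained sketch using the question's grid and f(z) = z^3, whose exact derivative is 3*z^2:
stepx = 0.01; stepy = 0.01;
Nx = 2/stepx + 1;
Ny = 2/stepy + 1;
[re, im] = meshgrid(-1:stepx:1, -1:stepy:1);
z = (re + 1i*im).^3;                                        % f(z) = z^3
f1 = diff(real(z),1,2)/stepx + 1i*diff(imag(z),1,2)/stepx;  % du/dx + i dv/dx
f2 = diff(imag(z),1,1)/stepy - 1i*diff(real(z),1,1)/stepy;  % dv/dy - i du/dy
f1i = interp1(1 : Ny, f1, 1.5 : Ny);       % move f1 to midpoints along y
f2i = interp1(1 : Nx, f2.', 1.5 : Nx).';   % move f2 to midpoints along x
max(abs(f1i(:) - f2i(:)))                  % on the order of 1e-4 for this stepsize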
I am trying to understand the normalization and "unnormalization" steps in the direct least squares ellipse fitting algorithm developed by Fitzgibbon, Pilu and Fisher (improved by Halir and Flusser).
EDITED: More details about the theory added. Is the eigenvalue problem where the confusion stems from?
Short theory:
The ellipse is represented by an implicit second-order polynomial (the general conic equation):
F(x,y) = a*x^2 + b*x*y + c*y^2 + d*x + e*y + f = 0
where the coefficient vector is A = [a b c d e f]'.
To constrain this general conic to an ellipse, the coefficients must satisfy the quadratic constraint b^2 - 4*a*c < 0, which (fixing the arbitrary scale of A) is equivalent to:
A'*C*A = 1
where C is a 6x6 matrix of zeros except, in the paper's sign convention, C(1,3) = C(3,1) = 2 and C(2,2) = -1; the MATLAB code below uses the negated C.
The design matrix D is composed of all data points: row i is [x_i^2, x_i*y_i, y_i^2, x_i, y_i, 1].
The minimization of the distance between a conic and the data points can be expressed by a generalized eigenvalue problem (some theory has been omitted). Denoting the scatter matrix S = D'*D, we now have the system:
S*A = lambda*C*A
If we solve this system, the eigenvector corresponding to the single positive eigenvalue is the correct answer.
The code:
The code snippets here are directly from the MATLAB code provided by the authors:
http://research.microsoft.com/en-us/um/people/awf/ellipse/fitellipse.html
The data input is a series of (x,y) points. The points are normalized by subtracting the mean and dividing by the standard deviation (in this case, computed as half the range). I'm assuming this normalization allows for a better fit of the data.
% normalize data
% X and Y are the vectors of data points, not normalized
mx = mean(X);
my = mean(Y);
sx = (max(X)-min(X))/2;
sy = (max(Y)-min(Y))/2;
x = (X-mx)/sx;
y = (Y-my)/sy;
% Build design matrix
D = [ x.*x x.*y y.*y x y ones(size(x)) ];
% Build scatter matrix
S = D'*D;
% Build 6x6 constraint matrix
C(6,6) = 0; C(1,3) = -2; C(2,2) = 1; C(3,1) = -2;
[gevec, geval] = eig(S,C);
% Find the negative eigenvalue
I = find(real(diag(geval)) < 1e-8 & ~isinf(diag(geval)));
% Extract eigenvector corresponding to negative eigenvalue
A = real(gevec(:,I));
After this, the normalization is reversed on the coefficients:
par = [
A(1)*sy*sy, ...
A(2)*sx*sy, ...
A(3)*sx*sx, ...
-2*A(1)*sy*sy*mx - A(2)*sx*sy*my + A(4)*sx*sy*sy, ...
-A(2)*sx*sy*mx - 2*A(3)*sx*sx*my + A(5)*sx*sx*sy, ...
A(1)*sy*sy*mx*mx + A(2)*sx*sy*mx*my + A(3)*sx*sx*my*my ...
- A(4)*sx*sy*sy*mx - A(5)*sx*sx*sy*my ...
+ A(6)*sx*sx*sy*sy ...
]';
At this point, I'm not sure what happened. Why is the unnormalization of the last three coefficients of A (d, e, f) dependent on the first three coefficients? How do you mathematically show where these unnormalization equations come from?
The 2 and 1 coefficients in the unnormalization lead me to believe the constraint matrix must be involved somehow.
Please let me know if more detail is needed on the method...it seems I'm missing how the normalization has propagated through the matrices and eigenvalue problem.
Any help is appreciated. Thanks!
First, let me formalize the problem in a homogeneous space (as used in Richard Hartley and Andrew Zisserman's book Multiple View Geometry):
Assume that,
P=[X,Y,1]'
is our point in the unnormalized space, and
p=lambda*[x,y,1]'
is our point in the normalized space, where lambda is an unimportant free scale (in homogeneous space [x,y,1]=[10*x,10*y,10] and so on).
Now it is clear that we can write
x = (X-mx)/sx;
y = (Y-my)/sy;
as a simple matrix equation like:
p=H*P; %(equation (1))
where
H=[1/sx, 0, -mx/sx;
0, 1/sy, -my/sy;
0, 0, 1];
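A quick numeric sanity check of equation (1), with made-up values for the normalization constants (mx, my, sx, sy here are hypothetical, just for illustration):
mx = 2; my = 3; sx = 4; sy = 5;                    % hypothetical example values
H = [1/sx, 0, -mx/sx; 0, 1/sy, -my/sy; 0, 0, 1];
P = [10; 7; 1];                                    % unnormalized homogeneous point
p = H*P                                            % equals [(10-mx)/sx; (7-my)/sy; 1]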
Also we know that an ellipse with the equation
A(1)*x^2 + A(2)*xy + A(3)*y^2 + A(4)*x + A(5)*y + A(6) = 0 %(first representation)
can be written in matrix form as:
p'*C*p=0 %you can easily verify this by matrix multiplication
where
C=[A(1), A(2)/2, A(4)/2;
A(2)/2, A(3), A(5)/2;
A(4)/2, A(5)/2, A(6)]; %second representation
and
p=[x,y,1]
and it is clear that these two representations of an ellipse are equivalent.
Also we know that the vector A=[A(1),A(2),A(3),A(4),A(5),A(6)] is a type-1 representation of the ellipse in the normalized space.
So we can write:
p'*C*p=0
where p is the normalized point and C is as defined previously.
Now we can use equation (1), p = H*P, to derive a useful result:
(H*P)'*C*(H*P)=0
=====>
P'*H'*C*H*P=0
=====>
P'*(H'*C*H)*P=0
=====>
P'*(C1)*P=0 %(equation (2))
We see that the equation (2) is an equation of an ellipse in the unnormalized space where C1 is the type-2 representation of ellipse and we know that:
C1=H'*C*H
And also, because equation (2) is an equation equal to zero, we can multiply it by any non-zero number. So we multiply it by sx^2*sy^2 and can write:
C1=sx^2*sy^2*H'*C*H
And finally we get the result
C1=[ A(1)*sy^2, (A(2)*sx*sy)/2, (A(4)*sx*sy^2)/2 - A(1)*mx*sy^2 - (A(2)*my*sx*sy)/2;
(A(2)*sx*sy)/2, A(3)*sx^2, (A(5)*sx^2*sy)/2 - A(3)*my*sx^2 - (A(2)*mx*sx*sy)/2;
-(- (A(4)*sx^2*sy^2)/2 + (A(2)*my*sx^2*sy)/2 + A(1)*mx*sx*sy^2)/sx, -(- (A(5)*sx^2*sy^2)/2 + A(3)*my*sx^2*sy + (A(2)*mx*sx*sy^2)/2)/sy, (mx*(- (A(4)*sx^2*sy^2)/2 + (A(2)*my*sx^2*sy)/2 + A(1)*mx*sx*sy^2))/sx + (my*(- (A(5)*sx^2*sy^2)/2 + A(3)*my*sx^2*sy + (A(2)*mx*sx*sy^2)/2))/sy + A(6)*sx^2*sy^2 - (A(4)*mx*sx*sy^2)/2 - (A(5)*my*sx^2*sy)/2]
which can be transformed into the type-2 ellipse and get the exact result we were looking for:
[ A(1)*sy^2, A(2)*sx*sy, A(3)*sx^2, A(4)*sx*sy^2 - 2*A(1)*mx*sy^2 - A(2)*my*sx*sy, A(5)*sx^2*sy - 2*A(3)*my*sx^2 - A(2)*mx*sx*sy, A(2)*mx*my*sx*sy + A(1)*mx*my*sy^2 + A(3)*my^2*sx^2 + A(6)*sx^2*sy^2 - A(4)*mx*sx*sy^2 - A(5)*my*sx^2*sy]
If you are curious how I managed to calculate these time-consuming equations, I can give you the MATLAB code to do it for you, as follows:
syms sx sy mx my
syms a b c d e f
C=[a, b/2, d/2;
b/2, c, e/2;
d/2, e/2, f];
H=[1/sx, 0, -mx/sx;
0, 1/sy, -my/sy;
0, 0, 1];
C1 = sx^2*sy^2*H.'*C*H
par = [C1(1,1), 2*C1(1,2), C1(2,2), 2*C1(1,3), 2*C1(2,3), C1(3,3)]  % type-1 coefficients
I have a set of data with independent variables x and y. Now I'm trying to build a two-dimensional regression model that has a regression surface cutting through my data points. However, I couldn't find a way to achieve this. Can anyone give me some assistance?
You could use my favorite, polyfitn for linear or polynomial models. If you would like a different model, please edit your question or add a comment. HTH!
EDIT
Also, take a look here under Multiple Regression, likely can help you as well.
EDIT AGAIN
Sorry, I'm having too much fun with this; here's an example of multivariate regression using least squares with stock MATLAB:
t = (1:10)';
x = t;
y = exp(-t);
A = [ y x ];
z = 10*y + 0.5*x;
A\z
ans =
10.0000
0.5000
If you are performing linear regression, the best tool is the regress function. Note that, if you are fitting a model of the form y(x1,x2) = b1.f(x1) + b2.g(x2) + b3 this is still a linear regression, as long as you know the functions f and g.
Nsamp = 100; %number of samples
X1 = randn(Nsamp,1); %regressor 1 (could also be some computed f(x1) )
X2 = randn(Nsamp,1); %regressor 2 (could also be some computed g(x2) )
Y = X1 + X2 + randn(Nsamp,1); %generate some data to be regressed
%now run the regression
[b,bint,r,rint,stats] = regress(Y,[X1 X2 ones(Nsamp,1)]);
% 'b' contains the coefficients, b1,b2,b3 of the fit; can be used to plot regression surface)
% 'r' contains residuals of the fit
% 'stats' contains the overall regression R^2, F stat, p-value and error variance
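To get the regression surface the question asks for, the coefficients in b can be evaluated on a grid and drawn through the data; a sketch, reusing X1, X2, Y and b from above:
[g1, g2] = meshgrid(linspace(min(X1), max(X1), 20), linspace(min(X2), max(X2), 20));
Yfit = b(1)*g1 + b(2)*g2 + b(3);   % fitted plane y = b1*x1 + b2*x2 + b3
scatter3(X1, X2, Y, 'filled')      % the data points
hold on
mesh(g1, g2, Yfit)                 % the regression surface
xlabel('x1'); ylabel('x2'); zlabel('y'); hold off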
I am trying to finish an assignment and I don't really know how to do what the question asks. I am not looking for a complete answer, but just an understanding on what I need to use/do to solve the question. Here is the question:
We are asked to provide an interpolant for the Bessel function of the first
kind of order zero, J0(x).
(a) Create a table of data points listed to 7 decimal places for the interpolation points
x1 = 1.0, x2 = 1.3, x3 = 1.6, x4 = 1.9, x5 = 2.2.
[Hint: See Matlab's help on BesselJ.]
(b) Fit a second-degree polynomial through the points x1, x2, x3. Use this interpolant
to estimate J0(1.5). Compute the error.
What exactly does besselj do? And how do I fit a second-degree polynomial through the three points?
Thanks,
Mikeshiny
Here's the zeroth order Bessel function of the first kind:
http://mathworld.wolfram.com/BesselFunctionoftheFirstKind.html
Bessel functions are to differential equations in cylindrical coordinates as sines and cosines are to ODEs in rectangular coordinates.
Both have series representations; both have polynomial approximations.
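For part (a), MATLAB's besselj(nu, x) evaluates the Bessel function of the first kind of order nu, so besselj(0, x) gives J0(x) directly; a small sketch for the assignment's interpolation points:
x = [1.0 1.3 1.6 1.9 2.2];          % interpolation points from the assignment
J0 = besselj(0, x);                 % J0 at those points
fprintf('%.1f   %.7f\n', [x; J0])   % table to 7 decimal places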
Here's a general second-order polynomial:
y = a0 + a1*x + a2*x^2
Substitute in three points (x1, y1), (x2, y2), and (x3, y3) and you'll have three equations for three unknown coefficients a0, a1, and a2. Solve for those coefficients.
Take a look at the plot of y = J0(x) in the link I gave you. You want to fit a 2nd order poly through some range. So - pick one. The first point is (0, 1). Pick two more - maybe x = 1 and x = 2. Look up the values for y at those values of x from a J0 table and evaluate your coefficients.
Here are my three points: (0,1), (1, 0.7652), (2.4048, 0).
When I calculate the coefficients, here's the 2nd order polynomial I get:
J0(x) ≈ 1 - 0.105931124*x - 0.128868876*x^2
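Those coefficients drop out of solving the 3-by-3 linear system directly; a sketch using the three points above:
xp = [0; 1; 2.4048];
yp = [1; 0.7652; 0];        % J0 values at xp
V = [ones(3,1) xp xp.^2];   % each row is [1, x, x^2]
a = V\yp                    % a(1) = a0, a(2) = a1, a(3) = a2
% yields approximately a0 = 1, a1 = -0.1059, a2 = -0.1289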