Fit f(x,y,z)=0 to a set of 3D points using MATLAB - matlab

The statement is the following:
Given a 3D set of points (x,y,z), fit a surface defined by a certain number of parameters IMPLICITLY
I'm definitely no expert in programming, but I need to get this done in one way or another. I've considered other programs, such as OriginPro, which can solve this problem pretty easily, but I want to have it done in MATLAB.
The surface is defined by:
A*x^2+B*y^2+C*z^2+D*x+E*y+F*z+G*xy+H*xz+I*yz+J=0
Considering that the Curve Fitting Toolbox can only fit explicit functions, what would you guys suggest?
IMPORTANT REMARK: I'm not asking for a solution, just advice on how to proceed

This can be chalked up to solving a linear system of equations where each point forms a constraint or equation in your system. You would thus find the set of coefficients in your surface equation that satisfies all of the points. Using the equation in your question as is, one would find the null space of the linear system that satisfies the surface equation. Specifically, given a set of m points with x, y and z coordinates, we can reformulate the above equation as a matrix-vector multiplication, with the first factor technically being a matrix of one row and the vector being the coefficients that fit your surface. This is important before we proceed to the null-space part of this problem.
In particular, you can agree with me that we can represent the above in the following matrix-vector multiplication:
[x^2  y^2  z^2  x  y  z  x*y  x*z  y*z  1] * [A; B; C; D; E; F; G; H; I; J] = 0
Our objective is to find the coefficients A, B, ..., J that would satisfy the constraint above. Now moving onto the more general case, since you have m points, we can build our linear system and thus a matrix of coefficients on the left side of this expression:
[x_1^2  y_1^2  z_1^2  x_1  y_1  z_1  x_1*y_1  x_1*z_1  y_1*z_1  1]                                     [0]
[x_2^2  y_2^2  z_2^2  x_2  y_2  z_2  x_2*y_2  x_2*z_2  y_2*z_2  1]                                     [0]
[x_3^2  y_3^2  z_3^2  x_3  y_3  z_3  x_3*y_3  x_3*z_3  y_3*z_3  1] * [A; B; C; D; E; F; G; H; I; J]  =  [0]
[                              ...                                ]                                    [...]
[x_m^2  y_m^2  z_m^2  x_m  y_m  z_m  x_m*y_m  x_m*z_m  y_m*z_m  1]                                     [0]
We now build this linear system, and solve to find our coefficients. The trick is to build the matrix that you see on the left hand side of this linear system, which I will call M. Each row is such that you create [x_i^2 y_i^2 z_i^2 x_i y_i z_i x_i*y_i x_i*z_i y_i*z_i 1] with x_i, y_i and z_i being the ith (x,y,z) coordinate in your dataset.
Once you build this, you would thus find the null space of this system. There are many methods to do this in MATLAB. One way is to simply use the null function on the matrix you build above and it will return to you a matrix where each column is a potential solution to the surface you are fitting above. That is, each column directly corresponds to the coefficients A, B, ..., J that would fit your data to the surface. You can also try using the singular value decomposition or QR decomposition if you like, but the null function is a good place to start as it uses singular value decomposition already.
I would like to point out that the above will only work if the matrix you provide is not of full column rank. Put simply, this happens when the number of points is less than the number of parameters. With 10 unknown coefficients, this method therefore only applies if you have at most 9 points. If you have exactly 9 linearly independent rows, the null space is one-dimensional and this method works very well, giving you one solution up to scale. If you have fewer than 9, there are more potential solutions as the number of degrees of freedom increases: the null space has dimension 10 - m, so you get 10 - m basis vectors, and any of them (or any combination of them) is a valid set of coefficients. If you have 10 or more unique points in general position, the matrix has full column rank, and the only vector in the null space is the trivial one with all coefficients set to 0.
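As a minimal sketch of this approach (assuming your data are stored in m-by-1 column vectors x, y and z), building the matrix and taking its null space could look like this:

%// x, y and z are assumed to be m-by-1 column vectors of your data points
M = [x.^2, y.^2, z.^2, x, y, z, x.*y, x.*z, y.*z, ones(numel(x),1)];
N = null(M);      %// each column of N is a candidate coefficient vector [A; B; ...; J] satisfying M*N(:,k) = 0
coeff = N(:,1);   %// pick one candidate solution (valid only if the null space is non-trivial)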
In order to escape the possibility of the null space being all 0, or the possibility of the null space providing more than one solution, you probably just want one solution, and you most likely have 10 or more possible points that you want to fit your data with. An alternative method that I can provide is simply an extension of the above but we don't need to find the null space. Specifically, you relax one of the coefficients, say J and you can set that to any value you wish. For example, set it to J = 1. Therefore, the system of equations now changes where J disappears from the mix and it now appears on the right side of the system:
[x_1^2  y_1^2  z_1^2  x_1  y_1  z_1  x_1*y_1  x_1*z_1  y_1*z_1]                                  [-1]
[x_2^2  y_2^2  z_2^2  x_2  y_2  z_2  x_2*y_2  x_2*z_2  y_2*z_2]                                  [-1]
[x_3^2  y_3^2  z_3^2  x_3  y_3  z_3  x_3*y_3  x_3*z_3  y_3*z_3] * [A; B; C; D; E; F; G; H; I]  =  [-1]
[                             ...                              ]                                 [...]
[x_m^2  y_m^2  z_m^2  x_m  y_m  z_m  x_m*y_m  x_m*z_m  y_m*z_m]                                  [-1]
You can thus find the parameters A, B, ..., I using linear least squares where the solution can be solved using the pseudoinverse. The benefit with this approach is that because the matrix is full rank, there is one and only one solution, thus being unique. Additionally, this formulation is nice because if there is an exact solution to the linear system, solving with the pseudoinverse will provide the exact solution. If there is no exact solution to the system, meaning that not all constraints are satisfied, the solution provided is one that minimizes the least squared error between the data and the parameters that were fit with that data.
MATLAB already has an awesome utility to solve a system through linear least squares - in fact, the core functionality of MATLAB is to solve linear algebra problems (if you didn't know that already). You can use matrix left division to solve the problem. Simply put, if we also call the matrix of coefficients built after relaxing J by the name M, the solution to the problem is simply coeff = M\(-ones(m,1));, with m being the number of points and coeff being the coefficients of the surface equation that fit your points. The -ones(m,1) expression creates a column vector of m elements, each equal to -1, matching the right-hand side above.
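Putting it together, a minimal sketch of this least-squares approach (again assuming m-by-1 column vectors x, y and z) could be:

%// x, y and z are assumed to be m-by-1 column vectors of your data points
m = numel(x);
M = [x.^2, y.^2, z.^2, x, y, z, x.*y, x.*z, y.*z];   %// J is relaxed to 1, so it drops out of the matrix
coeff = M \ (-ones(m,1));                            %// least-squares solution for [A; B; ...; I]
coeff = [coeff; 1];                                  %// append J = 1 to recover all ten coefficients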
The least-squares approach gives a more stable and unique solution because you are explicitly constraining one of the coefficients, J, to be 1. The null-space approach only works if you have fewer points than parameters, and may give you more than one solution: specifically, you will get 10 - m basis vectors of the null space, and any of them fits your data equally well.
I hope this is enough to get you started and good luck!

Related

Working with Givens rotations

Consider a matrix R of size p x p, and suppose we want to compute the product A'RA, where A is equal to (I + Givens rotation). Here I is an identity matrix and ' denotes the transpose operator.
We know that a Givens rotation is a sparse matrix written as:
To perform the multiplication A'RA in matlab, we can do this fast implementation:
%Fast implementation
ci = R(:,ik)*(cos(theta))+R(:,jk)*(sin(theta)); % R*A
cj = R(:,jk)*(cos(theta)) - R(:,ik)*(sin(theta));
R(:,ik) = ci;
R(:,jk) = cj;
ri = R(ik,:)*(cos(theta))+R(jk,:)*(sin(theta)); % A'*R*A
rj = R(jk,:)*(cos(theta)) - R(ik,:)*(sin(theta));
R(ik,:) = ri;
R(jk,:) = rj;
But I don't understand how this MATLAB code was written. In other words, I don't see how this code applies the multiplication A'RA. Can someone kindly help me understand it?
One possible source of confusion is that either the signs in the Givens rotation matrix, or the side on which we need to transpose, is wrong in your example. I'll assume the latter: I'll use the same A matrix as you defined, but transform with A*R*A' (changing the A to transpose is equivalent to taking the rotation angle with opposite sign).
The algorithm is relatively straightforward. For starters, as the comments in the code suggest, the transformation is performed in two steps:
Rnew = A * R * A' = A * (R * A')
First, we compute R*A'. For this, imagine the transformation matrix A = I + M with the Givens rotation matrix M. The formula you showed basically says "take a unit matrix, except for 2 specified dimensions in which you rotate by a given angle". Here's what the full A matrix looks like for a small problem (6d matrix, ik=2, jk=4, both in full and sparse form):
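(As a small sketch of that construction, with the example values ik = 2, jk = 4 and an arbitrary angle theta:)

% Build the full A matrix for a 6-d example: identity everywhere except the (ik,jk) subspace
p = 6; ik = 2; jk = 4; theta = pi/6;    % example values
A = eye(p);
A(ik,ik) =  cos(theta);  A(ik,jk) = sin(theta);
A(jk,ik) = -sin(theta);  A(jk,jk) = cos(theta);
A            % full form
sparse(A)    % sparse form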
You can see that except for the (ik,jk) 2d subspace, this matrix is a unit matrix, leaving every other dimension intact. So the action of R*A' will result in R for every dimension except for columns ik and jk.
In these two columns the result of R*A' is the linear combination of R(:,ik) and R(:,jk) with these trigonometric coefficients:
[R*A'](:,ik) = R(:,ik)*cos(theta) + R(:,jk)*sin(theta)
[R*A'](:,jk) = -R(:,ik)*sin(theta) + R(:,jk)*cos(theta)
while the rest of the columns are left unchanged. If you look at the code you cited, this is exactly what it's doing. This is, by definition, what R*A' means with the A matrix shown above. All of this is a consequence of the fact that the A matrix is a unit matrix except on a 2d subspace.
The next step is then quite similar: using this new R*A' matrix we multiply from the left with A. Again, the effect along most of the dimensions (rows, this time) will be identity, but in rows ik and jk we again get a linear combination:
[A*[R*A']](ik,:) = cos(theta)*[R*A'](ik,:) + sin(theta)*[R*A'](jk,:)
[A*[R*A']](jk,:) = -sin(theta)*[R*A'](ik,:) + cos(theta)*[R*A'](jk,:)
By noting that the code overwrites the R matrix with R*A' after the first step, it's again clear that the same is performed in the "fast implementation" code.
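(As a quick sanity check of this reading, here is a sketch with arbitrary example values: the fast column/row updates should reproduce the explicit product.)

% Compare the fast implementation against the explicit product A*R*A'
p = 6; ik = 2; jk = 4; theta = pi/6;    % example values
R0 = randn(p);                          % some test matrix
A = eye(p);
A(ik,ik) =  cos(theta);  A(ik,jk) = sin(theta);
A(jk,ik) = -sin(theta);  A(jk,jk) = cos(theta);

R = R0;                                          % run the fast implementation on a copy
ci = R(:,ik)*cos(theta) + R(:,jk)*sin(theta);    % R*A' (columns ik and jk)
cj = R(:,jk)*cos(theta) - R(:,ik)*sin(theta);
R(:,ik) = ci;  R(:,jk) = cj;
ri = R(ik,:)*cos(theta) + R(jk,:)*sin(theta);    % A*(R*A') (rows ik and jk)
rj = R(jk,:)*cos(theta) - R(ik,:)*sin(theta);
R(ik,:) = ri;  R(jk,:) = rj;

norm(R - A*R0*A')                                % should be on the order of machine precision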
Disclaimer: A' is the adjoint (conjugate transpose) in matlab, so you should use A.' to refer to the transpose. For complex matrices there's a huge difference, and people often forget to use the proper transpose when eventually encountering complex matrices.

Rotate a basis to align to vector

I have a matrix M of size NxP. Its P columns are mutually orthogonal (M is a basis). I also have a vector V of size N.
My objective is to transform the first column of M into V and to update the others so as to preserve their orthogonality. I know that V and M share the same origin, so it is basically a rotation by a certain angle. I assume we can find a matrix T such that T*M = M'. However, I can't figure out the details of how to do this (with MATLAB).
Also, I know there may be an infinite number of transforms that do this, but I'd like to get the simplest one (in which the other vectors of M approximately stay the same, i.e. no rotation around the first vector).
A small picture to illustrate. In my actual case, N and P can be large integers (not necessarily 3):
Thanks in advance for your help!
[EDIT] Alternative solution to Gram-Schmidt (accepted answer)
I managed to get a correct solution by retrieving a rotation matrix R by solving an optimization problem minimizing the 2-norm between M and R*M, under the constraints:
V is orthogonal to R*M[1] ... R*M[P-1] (i.e V'*(R*M[i]) = 0)
R*M[0] = V
Due to the solver constraints, I couldn't indicate that R*M[0] ... R*M[P-1] are all pairwise orthogonal (i.e (R*M)' * (R*M) = I).
Luckily, it seems that with this problem and with my solver (CVX using SDPT3), the resulting R*M[0] ... R*M[P-1] are also pairwise orthogonal.
I believe you want to use the Gram-Schmidt process here, which finds an orthogonal basis for a set of vectors. If V is not orthogonal to M[0], you can simply change M[0] to V and run Gram-Schmidt, to arrive at an orthogonal basis. If it is orthogonal to M[0], instead change another, non-orthogonal vector such as M[1] to V and swap the columns to make it first.
Mind you, the vector V needs to be in the column space of M, or you will always have a different basis than you had before.
Matlab doesn't have a built-in Gram-Schmidt command, although you can use the qr command to get an orthogonal basis. However, this won't work if you need V to be one of the vectors.
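A minimal Gram-Schmidt sketch along those lines (assuming V is an N-by-1 vector, M is N-by-P, and none of the intermediate vectors are numerically zero) could be:

%// Put V first (swap columns beforehand if V is orthogonal to M(:,1)), then re-orthogonalize
B = [V, M(:,2:end)];
Q = zeros(size(B));
for k = 1:size(B,2)
    v = B(:,k);
    for j = 1:k-1
        v = v - (Q(:,j)'*B(:,k))*Q(:,j);   %// remove the component along each earlier vector
    end
    Q(:,k) = v / norm(v);                  %// normalize
end
%// Q(:,1) is parallel to V and the columns of Q are orthonormal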
Option #1: if you have some vector and, after some changes, you want to rotate the matrix to restore its orthogonality, then I believe this method should work for you in MATLAB:
http://www.mathworks.com/help/symbolic/mupad_ref/numeric-rotationmatrix.html
(edit by another user: above link is broken, possible redirect: Matrix Rotations and Transformations)
If it does not, then ...
Option #2: I did not do this in MATLAB, but part of another task was to find the eigenvalues and eigenvectors of a matrix. To achieve this I used the SVD. Part of the SVD algorithm was the Jacobi rotation, which rotates the matrix until it is almost diagonal to within some precision, and invertible.
https://math.stackexchange.com/questions/222171/what-is-the-difference-between-diagonalization-and-orthogonal-diagonalization
The approximate Jacobi-rotation algorithm in your case should be similar to the following. I may be wrong on some points, so you will need to double-check this in the relevant docs:
1) change values in existing vector
2) compute angle between actual and new vector
3) create rotation matrix and ...
put Cosine(angle) on the diagonal of the rotation matrix
put Sin(angle) in the top left corner of the matrix
put -Sin(angle) in the bottom right corner of the matrix
4) multiply the vector or matrix of vectors by the rotation matrix in a loop until your vector matrix is invertible and diagonalizable. Invertibility can be checked via the determinant (check for singularity), and orthogonality (the matrix is diagonalized) can be tested with this check: if the max value in the LU matrix is less than some constant, stop rotating; at this point the new matrix should contain only orthogonal vectors.
Unfortunately, I am not able to find the exact pseudo-code I was referring to in the past, but these links may help you understand the Jacobi rotation:
http://www.physik.uni-freiburg.de/~severin/fulltext.pdf
http://web.stanford.edu/class/cme335/lecture7.pdf
https://www.nada.kth.se/utbildning/grukth/exjobb/rapportlistor/2003/rapporter03/maleko_mercy_03003.pdf

Exponential curve fit matlab

I have the following equation:
I want to do an exponential curve fit in MATLAB for the above equation, where y = f(u,a). y is my output, while (u,a) are my inputs. I want to find the coefficients A and B for a set of provided data.
I know how to do this for simple polynomials by defining states. As an example, if states = [ones(size(u)), u, u.^2], this will give me L + M*u + N*u^2, with L, M and N being regression coefficients.
However, this is not the case for the above equation. How could I do this in MATLAB?
Building on what @eigenchris said, simply take the natural logarithm (log in MATLAB) of both sides of the equation. If we do this, we are in fact linearizing the equation in log space. In other words, given your original equation:
We get:
However, this isn't exactly polynomial regression. This is more of a least-squares fit to your points. Specifically, given a set of y values and a set of (u,a) pairs, you would build a system of equations and solve that system via least squares. In other words, given the set y = (y_0, y_1, y_2, ..., y_N) and (u,a) = ((u_0, a_0), (u_1, a_1), ..., (u_N, a_N)), where N is the number of points that you have, you would build your system of equations like so:
This can be written in matrix form:
To solve for A and B, you simply need to find the least-squares solution. You can see that it's in the form of:
Y = AX
To solve for X, we use what is called the pseudoinverse. As such:
X = A^{*} * Y
A^{*} is the pseudoinverse. This can be done elegantly in MATLAB using the \ or mldivide operator. All you have to do is build a vector of the y values with the log taken, as well as the matrix of u and a values. Therefore, if your points are stored in column vectors u and a, and the y values in y, you would simply do this:
x = [u.^2 a.^3] \ log(y);
x(1) will contain the coefficient for A, while x(2) will contain the coefficient for B. As A. Donda has noted in his answer (which I embarrassingly forgot about), the values of A and B are obtained assuming that the errors with respect to the exact curve you are trying to fit are normally (Gaussian) distributed with constant variance. The errors also need to be additive. If this is not the case, then the parameters you obtain may not represent the best possible fit.
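As a quick usage sketch with made-up data (assuming the underlying model is y = exp(A*u.^2 + B*a.^3)):

%// Synthetic example: generate noise-free data from assumed "true" coefficients
u = linspace(0.1, 2, 50).';
a = linspace(0.5, 1.5, 50).';
Atrue = 0.7;  Btrue = -0.3;               %// made-up values for illustration
y = exp(Atrue*u.^2 + Btrue*a.^3);

x = [u.^2, a.^3] \ log(y);                %// least-squares fit in log space
%// x(1) should recover Atrue and x(2) should recover Btrue (up to numerical error)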
See this Wikipedia page for more details on what assumptions least-squares fitting takes:
http://en.wikipedia.org/wiki/Least_squares#Least_squares.2C_regression_analysis_and_statistics
One approach is to use a linear regression of log(y) with respect to u² and a³:
Assuming that u, a, and y are column vectors of the same length:
AB = [u .^ 2, a .^ 3] \ log(y)
After this, AB(1) is the fit value for A and AB(2) is the fit value for B. The computation uses Matlab's mldivide operator; an alternative would be to use the pseudo-inverse.
The fit values found this way are Maximum Likelihood estimates of the parameters under the assumption that deviations from the exact equation are constant-variance normally distributed errors additive to A u² + B a³. If the actual source of deviations differs from this, these estimates may not be optimal.

Sparse diagonal matrix solver

I want to solve, in MATLAB, a linear system (corresponding to a PDE system of two equations written in a finite-difference scheme). The action of the system matrix (corresponding to one of the diffusive terms of the PDE system) reads, symbolically (u is one of the unknown fields, n is the time step, j is the grid point):
and fully:
The above matrix is to be understood as A, where A*U^(n+1) = B is the system. U contains the 'u' and the 'v' (the second unknown field of the PDE system) values alternately: U = [u_1, v_1, u_2, v_2, ..., u_J, v_J].
So far I have been filling this matrix using spdiags and diag in the following expensive way:
E = zeros(2*J,1);
E(1:2:2*J) = 1;
E(2:2:2*J) = 0;
Dvec = zeros(2*J,1);
for i = 3:2:2*J-3
    Dvec(i) = D_11((i+1)/2);
end
for i = 4:2:2*J-2
    Dvec(i) = D_21(i/2);
end
A = diag(Dvec)*spdiags([-E,-E,2*E,2*E,-E,-E],[-3,-2,-1,0,1,2],2*J,2*J)/(dx^2);
and for the solution
[L,U] = lu(A);
y = L\B;
x = U\y;
where B is the right hand side vector.
This is obviously unreasonably expensive because it builds a full 2J x 2J matrix with diag(Dvec), performs a dense matrix multiplication, etc.
Then comes my question: is there a way to solve the system without passing MATLAB a matrix, e.g., by passing the vector Dvec, or alternatively directly D_11 and D_22?
This would spare me a lot of memory and processing time!
Matlab doesn't store sparse matrices as JxJ arrays but as lists of size O(J). See
http://au.mathworks.com/help/matlab/math/constructing-sparse-matrices.html
Since you are using the spdiags function to construct A, Matlab should already recognize A as sparse and you should indeed see such a list if you display A in console view.
For a banded matrix like yours, the L and U factors should also be sparse.
So you just need to ensure that the \ operator uses the appropriate sparse algorithm according to the rules in http://au.mathworks.com/help/matlab/ref/mldivide.html. It's not clear whether the vector B will already be considered sparse, but you could recast it as a diagonal matrix which should certainly be considered sparse.
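As a sketch of how the construction itself can also stay fully sparse (assuming the same E, Dvec, J, dx and B as in the question), replacing the dense diag with spdiags avoids ever forming a full matrix:

D = spdiags(Dvec, 0, 2*J, 2*J);   % sparse diagonal matrix instead of dense diag(Dvec)
A = D * spdiags([-E,-E,2*E,2*E,-E,-E], [-3,-2,-1,0,1,2], 2*J, 2*J) / dx^2;
x = A \ B;                        % backslash dispatches to a banded/sparse solver automatically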

How do I determine the coefficients for a linear regression line in MATLAB? [closed]

I'm going to write a program where the input is a data set of 2D points and the output is the regression coefficients of the line of best fit, found by minimizing the mean squared error.
I have some sample points that I would like to process:
X Y
1.00 1.00
2.00 2.00
3.00 1.30
4.00 3.75
5.00 2.25
How would I do this in MATLAB?
Specifically, I need to get the following formula:
y = A + Bx + e
A is the intercept and B is the slope while e is the residual error per point.
Judging from the link you provided, and my understanding of your problem, you want to calculate the line of best fit for a set of data points. You also want to do this from first principles. This will require some basic Calculus as well as some linear algebra for solving a 2 x 2 system of equations. If you recall from linear regression theory, we wish to find the best slope m and intercept b such that for a set of points ([x_1,y_1], [x_2,y_2], ..., [x_n,y_n]) (that is, we have n data points), we want to minimize the sum of squared residuals between this line and the data points.
In other words, we wish to minimize the cost function F(m,b,x,y):
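F(m,b,x,y) = sum_{i=1}^{n} (y_i - (m*x_i + b))^2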
m and b are our slope and intercept for this best fit line, while x and y are a vector of x and y co-ordinates that form our data set.
This function is convex, so there is an optimal minimum that we can determine. The minimum can be determined by finding the derivative with respect to each parameter, and setting these equal to 0. We then solve for m and b. The intuition behind this is that we are simultaneously finding m and b such that the cost function is jointly minimized by these two parameters. In other words:
OK, so let's find the first quantity, the derivative with respect to m:
We can drop the factor 2 from the derivative as the other side of the equation is equal to 0, and we can also do some distribution of terms by multiplying the -x_i term throughout:
Next, let's tackle the other parameter, b:
We can again drop the factor of 2 and distribute the -1 throughout the expression:
Knowing that the sum of 1 over all n points is simply n, we can simplify the above to:
Now, we need to simultaneously solve for m and b with the above two equations. This will jointly minimize the cost function which finds the best line of fit for our data points.
Doing some re-arranging, we can isolate m and b on one side of the equations and the rest on the other sides:
As you can see, we can formulate this into a 2 x 2 system of equations to solve for m and b. Specifically, let's re-arrange the two equations above so that it's in matrix form:
With regard to the above, we can decompose the problem into solving a linear system Ax = b. All you have to do is solve for x, which is x = A^{-1}*b. For a 2 x 2 matrix [p, q; r, s], the inverse is simply (1/(p*s - q*r)) * [s, -q; -r, p].
Therefore, by substituting our quantities into the above equation, we solve for m and b in matrix form, and it simplifies to this:
Carrying out this multiplication and solving for m and b individually, this gives:
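m = (sum(x_i)*sum(y_i) - n*sum(x_i*y_i)) / ((sum(x_i))^2 - n*sum(x_i^2))
b = (sum(x_i*y_i)*sum(x_i) - sum(y_i)*sum(x_i^2)) / ((sum(x_i))^2 - n*sum(x_i^2))

(These expressions use exactly the summed quantities that the MATLAB code below computes.)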
As such, to find the best slope and intercept to best fit your data, you need to calculate m and b using the above equations.
Given your data specified in the link in your comments, we can do this quite easily:
%// Define points
X = 1:5;
Y = [1 2 1.3 3.75 2.25];
%// Get total number of points
n = numel(X);
%// Define the relevant summed quantities
sumxi = sum(X);
sumyi = sum(Y);
sumxiyi = sum(X.*Y);
sumxi2 = sum(X.^2);
sumyi2 = sum(Y.^2);
%// Determine slope and intercept
m = (sumxi * sumyi - n*sumxiyi) / (sumxi^2 - n*sumxi2);
b = (sumxiyi * sumxi - sumyi * sumxi2) / (sumxi^2 - n*sumxi2);
%// Display them
disp([m b])
... and we get:
0.4250 0.7850
Therefore, the line of best fit that minimizes the error is:
y = 0.4250*x + 0.7850
However, if you want to use built-in MATLAB tools, you can use polyfit (credit goes to Luis Mendo for providing the hint). polyfit determines the line (or nth order polynomial curve, rather...) of best fit by linear regression, minimizing the sum of squared errors between the best-fit line and your data points. You call the function like so:
coeff = polyfit(x,y,order);
x and y are the x and y points of your data while order determines the order of the line of best fit you want. As an example, order=1 means that the line is linear, order=2 means that the line is quadratic and so on. Essentially, polyfit fits a polynomial of order order given your data points. Given your problem, order=1. As such, given the data in the link, you would simply do:
X = 1:5;
Y = [1 2 1.3 3.75 2.25];
coeff = polyfit(X,Y,1)
coeff =
0.4250 0.7850
The way coeff works is that these are the coefficients of the regression line, starting from the highest order in decreasing value. As such, the above coeff variable means that the regression line was fitted as:
y = 0.4250*x + 0.7850
The first coefficient is the slope while the second coefficient is the intercept. You'll also see that this matches up with the link you provided.
If you want a visual representation, here's a plot of the data points as well as the regression line that best fits these points:
plot(X, Y, 'r.', X, polyval(coeff, X));
Here's the plot:
polyval takes an array of coefficients (usually produced by polyfit), and you provide a set of x co-ordinates and it calculates what the y values are given the values of x. Essentially, you are evaluating what the points are along the best fit line.
Edit - Extending to higher orders
If you want to extend this so that you're finding the best fit for any nth order polynomial, I won't go into the details, but it boils down to constructing the following linear system. Given the relationship between the ith point (x_i, y_i):
You would construct the following linear system:
Basically, you would create a vector of points y, and you would construct a matrix X in which each column is your vector of points x raised to a power. Specifically, the first column is the zeroth power, the second column is the first power, the third column is the second power, and so on. You would do this up to m, which is the order of polynomial you want. The vector e would hold the residual error for each point in your set.
Specifically, the formulation of the problem can be written in matrix form as:
Once you construct this matrix, you find the parameters in the least-squares sense by calculating the pseudo-inverse. How the pseudo-inverse is derived you can read up on in the Wikipedia article I linked to, but it is the basis for minimizing a system by least squares. The pseudo-inverse is the backbone behind least-squares minimization. Specifically:
(X^{T}*X)^{-1}*X^{T} is the pseudo-inverse. X itself is a very popular matrix, which is known as the Vandermonde matrix and MATLAB has a command called vander to help you compute that matrix. A small note is that vander in MATLAB is returned in reverse order. The powers decrease from m-1 down to 0. If you want to have this reversed, you'd need to call fliplr on that output matrix. Also, you will need to append one more column at the end of it, which is the vector with all of its elements raised to the mth power.
I won't go through repeating your example for anything higher order than linear in full and will leave the details as a learning exercise, but the recipe is simply: construct the vector y, construct the matrix X (with vander, or explicitly), then find the parameters by applying the pseudo-inverse of X as above. A rough sketch for a quadratic fit of the same data follows.
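As a rough sketch (building the power matrix explicitly rather than with vander, and reusing the points from above):

%// Sketch: fit a 2nd-order polynomial to the same points by least squares
X = 1:5;
Y = [1 2 1.3 3.75 2.25];
m = 2;                                  %// desired polynomial order
Xmat = bsxfun(@power, X(:), 0:m);       %// columns are x.^0, x.^1, ..., x.^m
p = (Xmat.'*Xmat) \ (Xmat.'*Y(:));      %// normal-equations / pseudo-inverse solution
%// p(1) is the constant term, p(2) the linear term, p(3) the quadratic term
%// (compare with polyfit(X, Y, 2), which returns the same values in reverse order)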
Good luck!