Linear regression coefficients for multiple linear equations - matlab

I have multiple linear equations in the form of Zi=ai*Xi+bi*Yi for i = 1..30.
How can I calculate every pair of regression coefficient values, or those 30 values of a and b for each (Z,X,Y) combination using MATLAB?
I've tried the following code:
A=Z; B=[Xs Ys];
C = B \ A;
A are my Z points while B is a matrix of my X and Y points. However, I seem to only get one pair of regression coefficients for all of the points.
Thanks in advance!

What you have set up there is unfortunately not the right way to solve it, if I understand your problem formulation correctly. That formulation assumes you are trying to fit all of the points to a single line: each row of B would serve as one point on the one line whose regression coefficients you are finding. If you want to solve for multiple lines simultaneously, you need to change your formulation.
That is actually very simple. I'm going to assume that you have 30 (x,y) points, where each point defines one equation of a line. You have these stored as Xs and Ys respectively, and the output of each equation is stored in Z. I'm also going to assume these are column vectors, and therefore you have a system set up such that:

z_i = a_i*x_i + b_i*y_i,   for i = 1, 2, ..., N
a_i and b_i are the coefficients for each line. You know the (x,y) for each line, and your goal is to solve for each corresponding a and b. As such, you need to reformulate your system so that you're solving for the a and b values.
Rewriting that problem in matrix form, it can be done like so:

[ x_1  y_1   0    0   ...   0    0  ]   [ a_1 ]   [ z_1 ]
[  0    0   x_2  y_2  ...   0    0  ]   [ b_1 ]   [ z_2 ]
[  :    :    :    :         :    :  ] * [  :  ] = [  :  ]
[  0    0    0    0   ...  x_N  y_N ]   [ a_N ]   [ z_N ]
                                        [ b_N ]

The stacked vector of a_1, b_1, a_2, b_2, ... is what you are ultimately solving for. We have a matrix equation of the form Z = M*c, where M and Z are known and c is the coefficient vector we need to solve for with c = M\Z. As such, all you need to do is rearrange your x and y values into the block matrix above. First we find the correct linear indices so that we can place our x and y values into this matrix, then we solve the system with the mldivide operator (\). The matrix M is an N x 2N matrix, where N is the total number of equations or constraints that we have (so in your case, 30):
N = numel(Xs);                          % number of equations (30 here)
M = zeros(N, 2*N);
xind = sub2ind(size(M), 1:N, 1:2:2*N);  % row i, odd column 2*i - 1
yind = sub2ind(size(M), 1:N, 2:2:2*N);  % row i, even column 2*i
M(xind) = Xs;
M(yind) = Ys;
sub2ind allows you to place multiple values into a matrix with a single line of code. Specifically, sub2ind determines the linear indices corresponding to a set of row and column coordinates of a matrix. If you don't already know, you can access (and set) values in a matrix using a single number instead of a row and column pair. sub2ind thus lets you set multiple values in a matrix at once by specifying a set of linear indices together with a corresponding vector of values.
In our case, we need two sets of linear indices - one for the x values and one for the y values. Note that the x values start from the first column and skip every other column. The y values behave the same way but start at the second column. Once we have those indices, we set the x and y values in this matrix, and now we simply solve for the coefficients:
coeff = M \ Z;
coeff will now be a 2N x 1 vector, so if you want, you can reshape this into a matrix:
coeff = reshape(coeff, 2, []);
Now, coeff will be shaped such that each column will give you the pair of a,b for each equation that you had. As such, the first column denotes a_1, b_1, the second column denotes a_2, b_2 and so on. The first row of coeff is all of the a coefficients for each constraint while the second row is all of the b coefficients for each constraint.
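For completeness, here is a minimal end-to-end sketch with synthetic data (the variable names Xs, Ys and Z follow the question; the data themselves are made up). Note that with only one point per line the system is underdetermined, so mldivide returns one of infinitely many exact solutions; the check at the end only verifies that each returned pair satisfies its own equation:

N  = 30;
Xs = randn(N, 1);  Ys = randn(N, 1);  Z = randn(N, 1);  % made-up data
M = zeros(N, 2*N);
M(sub2ind(size(M), 1:N, 1:2:2*N)) = Xs;   % x values in the odd columns
M(sub2ind(size(M), 1:N, 2:2:2*N)) = Ys;   % y values in the even columns
coeff = reshape(M \ Z, 2, []);            % row 1 holds a_i, row 2 holds b_i
residual = coeff(1,:)'.*Xs + coeff(2,:)'.*Ys - Z;   % ~0 up to round-off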

Related

Determine the 'greatest' singular vector of U matrix after SVD in Matlab

It is known that in MATLAB the svd function outputs three matrices: [U,S,V] = svd(X).
Actually, 'U' is a square m x m matrix, where m is the number of rows of X. Also, 'S' is a non-square matrix with dimensions m x n that stores the singular values in decreasing order on its diagonal.
My question is how to determine (in MATLAB) which of the singular vectors in 'U' corresponds to the first (greatest) singular value in 'S'. Furthermore, some entries of a given singular vector are positive and others are negative. Does this minus or plus sign hide any mathematical meaning? I have seen examples that use the sign of the 'greatest' singular vector for classification purposes.
The diagonal of the S matrix contains the singular values. So for the ith singular value (in position (i,i) of S), the ith columns of U and V are the corresponding left and right singular vectors, which satisfy the two constraint equations X*v_i = s_i*u_i and X'*u_i = s_i*v_i.
I don't think the +/- hides any special meaning. After all, you could multiply the ith columns of both the U and the V matrices by -1 and the result would still be a valid SVD.
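A quick numerical sketch of both points (X here is just a made-up example matrix):

X = randn(5, 3);
[U, S, V] = svd(X);
u1 = U(:, 1);   % left singular vector of the largest singular value
s1 = S(1, 1);   % svd places the largest value at S(1,1)
U2 = U;  V2 = V;
U2(:, 1) = -U2(:, 1);  V2(:, 1) = -V2(:, 1);   % flip one matched pair of signs
norm(X - U2*S*V2')   % ~0: still a valid SVD of X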
To be perfectly accurate: by definition the singular values of an SVD are not necessarily ordered, but MATLAB's svd reorders them in decreasing order.
The ith column of U corresponds to the ith singular value of M. Namely, for the ith singular value sigma_i, you have

M' * u_i = sigma_i * v_i

and you also have

M * v_i = sigma_i * u_i
Be careful, it might not be what you are looking for.
The entries of a singular vector are its coordinates in the original basis. A positive value means your new variable is positively proportional to the corresponding original variable. In statistics this is generally used when you know that both the original and the transformed variables increase or decrease together.
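A quick check of those two relations in MATLAB (M here is a made-up example):

M = randn(6, 4);
[U, S, V] = svd(M);
i = 1;                                    % MATLAB puts the largest value first
norm(M  * V(:, i) - S(i, i) * U(:, i))    % ~0
norm(M' * U(:, i) - S(i, i) * V(:, i))    % ~0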

Matlab Multiply A Matrix By Individual Sections of Another Matrix And Get the Diagonal Elements

The title of this post may be a bit confusing. Please allow me to provide a bit of context and then elaborate on what I'm asking. For your reference, the question I'm asking is toward the end and is denoted by bold letters. I provide some code, outlining where I'm currently at in solving the problem, immediately beforehand.
Essentially what I'm trying to do is kernel regression, which is usually done using a single test point x and a set of training instances. A reference to this can be found on Wikipedia here. The kernel I'm using is the RBF kernel, a Wikipedia reference for which can be found here.
Anyway, I have some code written in Matlab so that this can be done quickly for a single instance of x, which is 1 x p in size. What I'd like to do is make it so I can estimate for numerous points very quickly, say m x p.
For the sake of avoiding notational mixups, I'll let the training instances be denoted Train (n x p) and the instances I want estimates for be denoted Test (m x p). It also needs to be mentioned that I want to estimate a vector of numbers for each of the m points; for a single point this vector would be 1 x v in size, so now I need it to be m x v. Therefore, Train will also have a matrix of these known values associated with it, called TS (n x v). Lastly, we need a vector of sigmas that is 1 x v in size, denoted as Sig.
Here's the code I have so far:
%First, we have to get the matrices to equivalent size so we can subtract Train from Test
%(Test point i is repeated across rows (i-1)*n+1 through i*n, paired with every Train point)
tm0 = kron(Test, ones(size(Train,1),1)) - repmat(Train, size(Test,1), 1);
%Secondly, we apply the squared Euclidean norm by row, scale by 1/(2*Sig(j)^2) for each element j of Sig, and exponentiate
tm3 = exp(-kron(sum(tm0.^2, 2), 0.5./(Sig.^2)));
Now, at this point tm3 is an (m*n) x v matrix. This is where my question is: I now need to multiply TS' (TS transpose) by each of the m segments of tm3, where each segment is n x v in size. After multiplication each of the m segments becomes v x v, and its diagonal is 1 x v, so collecting the diagonals yields an m x v matrix. Summing these diagonal elements across each row produces an m x 1 matrix. Lastly, I will need to divide each entry i in this m x 1 matrix by each of the v elements in the ith row of the diagonal-holding m x v matrix, producing an m x v result matrix.
I hope all of that makes sense. I'm sure there's some kind of trick that can be employed, but I'm just not coming up with it. Any help is greatly appreciated.
Edit 1: I was asked to provide more of an example to help demonstrate what it is that I would like done. The following describes the two matrices I'm talking about, TS and tm3:
As you can see, TS' (TS transpose) is v x n and tm3 is mn x v. In tm3 there are m blocks, each of size n x v. Since TS' is v x n, I can multiply it by a single block of tm3, which again is n x v; this results in a matrix that is v x v in size. I would like to do this operation individually for each of the m blocks of tm3, producing m v x v matrices.
From here, though, I would like to obtain the diagonal elements from each of these v x v matrices. So, for a single v x v matrix, denoted a, I would extract its diagonal [a_11, a_22, ..., a_vv], which is 1 x v.
Ultimately, I would like to do this for each of the m v x v matrices, stacking the resulting 1 x v diagonals as rows, the last row coming from the mth v x v matrix, s.
If I denote this last matrix as Q, which is m x v in size, it is trivial to sum the elements across the rows to produce the m x 1 vector I was looking for; I will refer to this vector as C. However, I would then like to divide each of these m scalar values by the corresponding row of matrix Q, producing another m x v matrix whose (i,j) entry is C(i)/Q(i,j).
This is the final matrix I'm looking for. Hopefully this helps make it clear what I'm looking for. Thanks for taking the time to read this!
Thought: I'm pretty sure I could accomplish this by converting tm3 to a cell array with tc1 = mat2cell(tm3, repmat(size(Train,1),1,m), length(Sig)), then replicating TS' m times in another cell array with tc2 = repmat({TS'}, m, 1). Finally, I could do operations like tc3 = cellfun(@(a,b) a*b, tc2, tc1, 'UniformOutput', false), which would give me m cells filled with the v x v matrices I was looking for. I could proceed from there. However, I'm not sure how fast these cell operations are. Can anybody comment? I'm afraid they might be slow, so I would prefer operations be performed on normal matrices, which I know to be fast. Thanks!
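Cell arrays do carry overhead, so here is a plain-matrix sketch of the same computation, under the assumption that TS is n x v and tm3 stacks the m blocks vertically as described above. It exploits the identity diag(A'*B) = sum(A.*B, 1)', so the v x v products never have to be formed explicitly:

n = size(TS, 1);  v = size(TS, 2);  m = size(tm3, 1) / n;
T = reshape(tm3, n, m, v);                   % T(:, i, :) is the ith n x v block
Q = reshape(sum(bsxfun(@times, T, reshape(TS, n, 1, v)), 1), m, v);  % the m diagonals
C = sum(Q, 2);                               % m x 1 row sums
result = bsxfun(@rdivide, C, Q);             % result(i,j) = C(i)/Q(i,j)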

Difference in between Covariance and Correlation Matrix

In Matlab, I have created a matrix A with size (244x2014723)
and a matrix B with size (244x1)
I was able to calculate the correlation matrix using corr(A,B), which yielded a matrix of size 2014723x1. So, every column of matrix A is correlated with matrix B, giving one row value in the 2014723x1 result.
My question is when I ask for a covariance matrix using cov(A,B), I get an error saying A and B should be of same sizes. Why do I get this error? How is the method to find corr(A,B) any different from cov(A,B)?
The answer is pretty clear if you read the documentation:
cov:
If A and B are matrices of observations, cov(A,B) treats A and B as vectors and is equivalent to cov(A(:),B(:)). A and B must have equal size.
corr
corr(X,Y) returns a p1-by-p2 matrix containing the pairwise correlation coefficient between each pair of columns in the n-by-p1 and n-by-p2 matrices X and Y.
The difference between corr(X,Y) and the MATLAB® function corrcoef(X,Y) is that corrcoef(X,Y) returns a matrix of correlation coefficients for the two column vectors X and Y. If X and Y are not column vectors, corrcoef(X,Y) converts them to column vectors.
One way you could get the covariances of your vector with each column of your matrix is to use a loop. Another way (which might be inefficient depending on the size) is
C = cov([B,A])
and then look at the first row (or column) of C.
See link
In the 'More About' section, the equation describing how cov is computed for cov(A,B) makes it clear why they need to be the same size: the summation runs over a single index that enumerates the elements of both A and B.
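If looping is too slow, here is a sketch of a loop-free alternative that avoids forming the huge cov([B,A]) matrix (sizes as in the question, 244 x 2014723 and 244 x 1):

n  = size(A, 1);
Ac = bsxfun(@minus, A, mean(A, 1));   % center each column of A
Bc = B - mean(B);                     % center B
covAB = (Ac' * Bc) / (n - 1);         % 2014723 x 1: cov of each column with B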

Principal Components calculated using different functions in Matlab

I am trying to understand principal component analysis in MATLAB. There seem to be at least 3 different functions that do it.
I have some questions re the code below:
1. Am I creating approximate x values using only one eigenvector (the one corresponding to the largest eigenvalue) correctly? I think so?
2. Why are PC and V, which are both meant to be the loadings for (x'*x), presented differently? The column order is reversed because eig does not order the eigenvalues with the largest value first, but why are they the negatives of each other?
3. Why are the eigenvalues not ordered with the eigenvector corresponding to the largest eigenvalue in the first column?
4. Using the code below I get back to the input matrix x when using svd and eig, but the results from princomp seem to be totally different. What do I have to do to make princomp match the other two functions?
Code:
x=[1 2;3 4;5 6;7 8 ]
econFlag=0;
[U,sigma,V] = svd(x,econFlag);%[U,sigma,coeff] = svd(z,econFlag);
U1=U(:,1);
V1=V(:,1);
sigma_partial=sigma(1,1);
score1=U*sigma;
test1=score1*V';
score_partial=U1*sigma_partial;
test1_partial=score_partial*V1';
[PC, D] = eig(x'*x)
score2=x*PC;
test2=score2*PC';
PC1=PC(:,2);
score2_partial=x*PC1;
test2_partial=score2_partial*PC1';
[o1 o2 o3]=princomp(x);
Yes. According to the documentation of svd, the diagonal elements of the output S are in decreasing order. There is no such guarantee for the output D of eig, though.
Eigenvectors and singular vectors have no defined sign. If a is an eigenvector, so is -a.
I've often wondered the same. Laziness on the part of TMW? Optimization, because sorting would be an additional step and not everybody needs 'em sorted?
princomp centers the input data before computing the principal components. This makes sense as normally the PCA is computed with respect to the covariance matrix, and the eigenvectors of x' * x are only identical to those of the covariance matrix if x is mean-free.
I would compute the PCA by transforming to the basis of the eigenvectors of the covariance matrix (computed from centered data), but apply this transform to the original (uncentered) data. This makes it possible to capture a maximum of variance with as few principal components as possible, but still to recover the original data from all of them:
[V, D] = eig(cov(x));
score = x * V;
test = score * V';
test is identical to x, up to numerical error.
In order to easily pick the components with the most variance, let's fix that lack of sorting ourselves:
[V, D] = eig(cov(x));
[D, ind] = sort(diag(D), 'descend');
V = V(:, ind);
score = x * V;
test = score * V';
Reconstruct the signal using the strongest principal component only:
test_partial = score(:, 1) * V(:, 1)';
In response to Amro's comments: It is of course also possible to first remove the means from the input data, and transform these "centered" data. In that case, for perfect reconstruction of the original data it would be necessary to add the means again. The way to compute the PCA given above is the one described by Neil H. Timm, Applied Multivariate Analysis, Springer 2002, page 446:
Given an observation vector Y with mean mu and covariance matrix Sigma of full rank p, the goal of PCA is to create a new set of variables called principal components (PCs) or principal variates. The principal components are linear combinations of the variables of the vector Y that are uncorrelated such that the variance of the jth component is maximal.
Timm later defines "standardized components" as those which have been computed from centered data and are then divided by the square root of the eigenvalues (i.e. variances), i.e. "standardized principal components" have mean 0 and variance 1.
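To make princomp match the other two routes (question 4 above), the missing step is simply adding the subtracted means back. A short sketch, assuming the classic [coeff, score] = princomp(x) interface:

[coeff, score] = princomp(x);   % coeff: loadings of cov(x), sorted, up to sign
mu = mean(x, 1);                % the means princomp removed
x_rec = score * coeff' + repmat(mu, size(x, 1), 1);
norm(x - x_rec)                 % ~0: x is recovered up to round-off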

Is it possible to reverse svds

Is it possible to reverse the following in matlab:
[U,S,V]=svds(fulldata,columns);
Quoting MathWorks:
[U,S,V] = svd(X) produces a diagonal matrix S of the same dimension as X, with nonnegative diagonal elements in decreasing order, and unitary matrices U and V so that X = U*S*V'.
In the case of svds, you will lose some information unless columns equals the full rank of fulldata (at most min(size(fulldata)), i.e. a full SVD). Otherwise, the original matrix cannot be reconstructed exactly from the truncated factors; U*S*V' only gives a low-rank approximation of fulldata.
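A short sketch of what "reversing" svds actually yields when columns is smaller than the full rank (fulldata here is a made-up stand-in):

fulldata = randn(50, 40);
columns  = 10;
[U, S, V] = svds(fulldata, columns);
approx = U * S * V';                       % best rank-10 approximation, not fulldata
norm(fulldata - approx) / norm(fulldata)   % > 0: the discarded part is lost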