Difference between covariance and correlation matrix - MATLAB

In MATLAB, I have created a matrix A of size 244x2014723
and a matrix B of size 244x1.
I was able to calculate the correlation matrix using corr(A,B), which yielded a matrix of size 2014723x1. So every column of A is correlated with B, giving one row of the 2014723x1 result.
My question is: when I ask for a covariance matrix using cov(A,B), I get an error saying A and B should be the same size. Why do I get this error? How is the method behind corr(A,B) any different from cov(A,B)?

The answer is pretty clear if you read the documentation:
cov:
If A and B are matrices of observations, cov(A,B) treats A and B as vectors and is equivalent to cov(A(:),B(:)). A and B must have equal size.
corr:
corr(X,Y) returns a p1-by-p2 matrix containing the pairwise correlation coefficient between each pair of columns in the n-by-p1 and n-by-p2 matrices X and Y.
The difference between corr(X,Y) and the MATLAB® function corrcoef(X,Y) is that corrcoef(X,Y) returns a matrix of correlation coefficients for the two column vectors X and Y. If X and Y are not column vectors, corrcoef(X,Y) converts them to column vectors.
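One way to get the covariance of your vector with each column of your matrix is to use a loop. Another way (which may be inefficient, or even infeasible, depending on the size) is
C = cov([B,A])
and then look at the first row (or column) of C. Note that for a matrix A with p columns, C has (p+1)^2 entries, so this is only practical when p is modest.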
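As a rough sketch of a loop-free alternative that never builds the full covariance matrix: the covariance of B with each column of A can be computed as one matrix-vector product with the centered B. The sizes below are small stand-ins chosen for illustration, and the variable names covAB and maxDiff are made up for this example.

```matlab
% Small stand-in sizes; the question's A is 244 x 2014723.
rng(0);                         % fixed seed so the run is repeatable
n = 244;
p = 5;
A = rand(n, p);
B = rand(n, 1);

% Center B only. Because the centered B sums to zero, centering A as well
% would not change the result, so the large matrix A is never copied.
Bc = B - mean(B);               % n x 1
covAB = (A' * Bc) / (n - 1);    % p x 1: covariance of each column of A with B

% Cross-check against the first column of the full covariance matrix,
% which is only affordable here because p is small.
C = cov([B, A]);
maxDiff = max(abs(covAB - C(2:end, 1)));
```

The division by n - 1 matches the default normalization of cov; maxDiff should be at the level of rounding error.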

See the documentation for cov.
In the "More About" section, the equation describing how cov(A,B) is computed makes it clear why A and B need to be the same size. The summation runs over a single index that enumerates the elements of both A and B.

Related

Determine the 'greatest' singular vector of U matrix after SVD in Matlab

It is known that MATLAB's SVD function outputs three matrices: [U,S,V] = svd(X).
Actually, 'U' is a square m x m matrix, and 'S' is a non-square m x n matrix that stores the singular values (associated with the left singular vectors in 'U') in descending order on its diagonal.
My question is how to determine (in MATLAB) which of the m singular vectors in 'U' corresponds to the first (greatest) singular value in 'S'. Furthermore, some entries of that singular vector are positive and others are negative. Does the plus or minus sign hide any mathematical meaning? I have seen examples that use the sign of the 'greatest' singular vector for classification purposes.
The diagonal of the S matrix contains the singular values. So for the ith singular value (in the (i,i) position of S), the ith columns of U and V are the corresponding left and right singular vectors that satisfy the two defining equations.
I don't think the +/- hides any special meaning. After all, you could multiply both the U and the V matrices by a -1 constant and the result would still be valid.
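A minimal sketch of both points, using a small random matrix as a stand-in for the real data: the first columns of U and V pair with the largest singular value, and flipping the sign of a matching pair of columns still gives a valid decomposition. The names u1, v1, U2, V2 and the error variables are introduced only for this illustration.

```matlab
rng(1);                           % fixed seed so the run is repeatable
X = rand(5, 3);
[U, S, V] = svd(X);

s = diag(S);                      % singular values, largest first
u1 = U(:, 1);                     % left singular vector paired with s(1)
v1 = V(:, 1);                     % right singular vector paired with s(1)

% Defining relations for the first singular triplet.
err1 = norm(X * v1 - s(1) * u1);
err2 = norm(X' * u1 - s(1) * v1);

% Flip the sign of the matching columns of U and V together.
U2 = U;
V2 = V;
U2(:, 1) = -U2(:, 1);
V2(:, 1) = -V2(:, 1);
errFlip = norm(U2 * S * V2' - X); % the product still reproduces X
```

All three error values should be at the level of rounding error, which is why the overall sign of a singular vector carries no meaning by itself.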
To be perfectly accurate, by definition the singular values of an SVD are not necessarily ordered, but MATLAB's svd sorts them in decreasing order.
The ith column of U corresponds to the ith singular value of M.
Namely, for the ith singular value sigma_i, the ith columns u_i of U and v_i of V satisfy
M' * u_i = sigma_i * v_i
and you also have
M * v_i = sigma_i * u_i
Be careful, it might not be what you are looking for.
The coordinates of your singular vectors are coordinates in the original basis. A positive value means your new variable is positively proportional to the corresponding original variable. In statistics this is generally used when you know that both the original and the transformed variables increase or decrease together.

Linear regression coefficients for multiple linear equations

I have multiple linear equations in the form of Zi=ai*Xi+bi*Yi for i = 1..30.
How can I calculate every pair of regression coefficient values, or those 30 values of a and b for each (Z,X,Y) combination using MATLAB?
I've tried the following code:
A=Z; B=[Xs Ys];
C = B \ A;
A holds my Z points, while B is a matrix of my X and Y points. However, I seem to get only one pair of regression coefficients for all of the points.
Thanks in advance!
What you have set up there is unfortunately not the right way to solve it, if I understand your problem formulation. That formulation assumes that you are trying to fit all of the points to a single line. Each row of B would thus serve as one point on the single line you are trying to fit. If you want to solve for multiple lines simultaneously, you need to change your formulation.
That is actually very simple. I'm going to assume that you have 30 (x,y) points where each point denotes one equation of a line. You have these stored in Xs and Ys respectively. The output of each of these equations is in Z. I'm also going to assume these are column vectors, and therefore you have a system set up such that:
z_1 = a_1*x_1 + b_1*y_1
z_2 = a_2*x_2 + b_2*y_2
...
z_N = a_N*x_N + b_N*y_N
a_i and b_i are the coefficients for each line. You know the (x,y) for each line and your goal is to solve for each corresponding a and b. As such, you need to reformulate your system so that you're solving for a and b.
Rewriting that problem in matrix form, it can be done like so: Z = M * [a_1; b_1; a_2; b_2; ...; a_N; b_N], where row i of M is zero except for x_i in column 2i-1 and y_i in column 2i:
M = [x_1 y_1  0   0  ...  0   0
      0   0  x_2 y_2 ...  0   0
                 ...
      0   0   0   0  ... x_N y_N]
The vector of a_1, b_1, a_2, b_2, ... on the right-hand side is what you are ultimately solving for. You can see that we have a matrix equation of the form Y = M*X (here Y is Z), where M and Y are known and X is what we need to solve for by doing X = M\Y. As such, all you need to do is rearrange your x and y values into a block matrix like the above. First we need to find the correct linear indices so that we can place our x and y values into this matrix, then solve the system with the mldivide operator (\). Note that with N equations and 2N unknowns the system is underdetermined, so M\Z returns one solution out of infinitely many. The matrix is an N x 2N matrix where N is the total number of equations or constraints that we have (so in your case, 30):
N = numel(Xs);
M = zeros(N, 2*N);
xind = sub2ind(size(M), 1:N, 1:2:2*N);
yind = sub2ind(size(M), 1:N, 2:2:2*N);
M(xind) = Xs;
M(yind) = Ys;
sub2ind allows you to place multiple values into a matrix with a single line of code. Specifically, sub2ind determines the linear indices that correspond to a set of row and column coordinates. If you don't already know, you can access (and set) values in a matrix using a single number instead of a row and column pair. With the linear indices from sub2ind, you can set multiple values in a matrix at once by assigning a vector of matching length.
In our case, we need two sets of linear indices: one for the x values and one for the y values. Note that the x values start at the first column and skip every other column. The y values follow the same pattern but start at the second column. Once we have those indices, we set the x and y values in this matrix and then simply solve for the coefficients:
coeff = M \ Z;
coeff will now be a 2N x 1 vector, so if you want, you can reshape it into a matrix:
coeff = reshape(coeff, 2, []);
Now, coeff will be shaped such that each column will give you the pair of a,b for each equation that you had. As such, the first column denotes a_1, b_1, the second column denotes a_2, b_2 and so on. The first row of coeff is all of the a coefficients for each constraint while the second row is all of the b coefficients for each constraint.
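Putting the steps above together, here is a small end-to-end sketch with N = 4 instead of 30. The data (Xs, Ys, aTrue, bTrue) are random stand-ins invented for the example. Because each equation has two unknowns, the coefficients returned by the backslash operator reproduce Z but are not expected to match aTrue and bTrue.

```matlab
rng(2);                              % fixed seed so the run is repeatable
N = 4;                               % the question has N = 30
Xs = rand(N, 1) + 1;                 % shifted away from zero
Ys = rand(N, 1) + 1;
aTrue = rand(N, 1);                  % hypothetical coefficients that generate Z
bTrue = rand(N, 1);
Z = aTrue .* Xs + bTrue .* Ys;

% Build the N x 2N block matrix: x_i in column 2i-1, y_i in column 2i.
M = zeros(N, 2*N);
M(sub2ind(size(M), 1:N, 1:2:2*N)) = Xs;
M(sub2ind(size(M), 1:N, 2:2:2*N)) = Ys;

% Solve and reshape so that column i holds [a_i; b_i].
coeff = reshape(M \ Z, 2, []);

% The solution satisfies every equation even though it is not unique.
residual = norm(M * coeff(:) - Z);
```

If each Z_i, X_i and Y_i were instead a whole column of observations, a per-column solve of the form [X(:,i) Y(:,i)] \ Z(:,i) would be the more natural fit, but that is a different reading of the question from the one assumed here.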

Is it possible for the determinant of a 256x256 matrix to be infinite?

I have 256x1 feature vectors that come from 16x16 grayscale images. There are 550 such vectors.
When I compute the sample covariance of these vectors and then the determinant of the covariance matrix,
the answer is Inf.
Is it possible for the determinant of a finite matrix with values in a finite range (0:255) to be infinite, or have I made a mistake somewhere?
In fact, I want to do classification with Bayesian estimation. My distribution is Gaussian, and when
I compute the determinant it is Inf, so the final answer (the likelihood) is zero.
Part of my code:
Mean = mean(dataSet,2);
MeanMatrix = Mean*ones(1,NoC);
Xc = double(dataSet)-MeanMatrix; % center the data at the origin
Sigma = (1/NoC) *Xc*Xc'; % calculate sample covariance matrix
Parameters(i).M = Mean';
Parameters(i).C = Sigma;
likelihoods(i) = (1/(2*pi*sqrt(det(params(i).C)))) * (exp(-0.5 * (double(X)-params(i).M)' * inv(params(i).C) * (double(X)-params(i).M)));
The variable i indexes my classes;
the variable X is my feature vector.
Can the determinant of such matrix be infinite? No it cannot.
Can it evaluate as infinite? Yes definitely.
Here is an example of a matrix with a finite number of elements, none of them very large, whose determinant will nevertheless rarely evaluate to a finite number:
det(rand(255)*255)
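One common way around this kind of overflow, sketched here under the assumption that the covariance matrix is positive definite, is to work with the log-determinant from a Cholesky factor and to evaluate the Gaussian log-likelihood instead of the likelihood itself. The data below are random stand-ins for the image features, and the names logDet, logLik and dev are made up for this example.

```matlab
rng(3);                                    % fixed seed so the run is repeatable
d = 256;                                   % feature dimension (16 x 16 pixels)
N = 550;                                   % number of feature vectors
data = 255 * rand(d, N);                   % stand-in for the real features
mu = mean(data, 2);
Xc = bsxfun(@minus, data, mu);             % centered data
Sigma = (Xc * Xc') / N;                    % sample covariance, d x d

detDirect = det(Sigma);                    % overflows double precision here

% Cholesky factor: Sigma = R' * R with R upper triangular, so
% log(det(Sigma)) = 2 * sum(log(diag(R))).
R = chol(Sigma);
logDet = 2 * sum(log(diag(R)));

% Gaussian log-likelihood of one feature vector, without calling inv or det.
x = data(:, 1);
dev = x - mu;
y = R' \ dev;                              % y' * y equals dev' * inv(Sigma) * dev
logLik = -0.5 * (d * log(2*pi) + logDet + y' * y);
```

Classes can then be compared by their log-likelihoods, which stay finite even when the determinant itself does not fit in a double.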
In your case, probably what is happening is that you have too few datapoints to produce a full-rank covariance matrix.
For instance, if you have N examples, each with dimension d, and N<d, then your d x d covariance matrix will not be full rank and will have a determinant of zero.
In this case, the matrix inverse (the precision matrix) does not exist. Attempting to compute the determinant of the inverse (by taking 1/|X'*X| = 1/0 -> Inf) will produce an infinite value.
One way to get around this problem is to set the covariance to X'*X+eps*eye(d), where eps is a small positive value of your choosing (MATLAB's built-in eps, about 2.2e-16, is likely to be too small for this purpose). This technique corresponds to placing a weak prior distribution on the covariance.
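A small sketch of that regularization, with fewer samples than dimensions so that the unregularized covariance is rank deficient. The value of lambda is a hypothetical choice for illustration and would need tuning for real data.

```matlab
rng(4);                               % fixed seed so the run is repeatable
d = 6;                                % dimension
N = 4;                                % fewer samples than dimensions
X = rand(N, d);                       % one sample per row
Xc = bsxfun(@minus, X, mean(X, 1));   % centered samples
S = (Xc' * Xc) / N;                   % d x d covariance, rank at most N-1

r = rank(S);                          % smaller than d, so S is singular

lambda = 1e-3;                        % hypothetical regularization strength
Sreg = S + lambda * eye(d);           % shifts every eigenvalue up by lambda

[~, p] = chol(Sreg);                  % p == 0 means Sreg is positive definite
```

With p equal to 0 the regularized matrix has a proper inverse and a nonzero determinant.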
No, it is not possible. The matrix may be singular, but a matrix whose elements are finite, even if they are large, always has a finite determinant.

Is it possible to reverse svds

Is it possible to reverse the following in matlab:
[U,S,V]=svds(fulldata,columns);
Quoting MathWorks:
[U,S,V] = svd(X) produces a diagonal matrix S of the same dimension as X, with nonnegative diagonal elements in decreasing order, and unitary matrices U and V so that X = U*S*V'.
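In the case of svds, one loses information unless columns covers every nonzero singular value of fulldata (there are at most min(size(fulldata)) of them). With a smaller value of columns, U*S*V' gives only a low-rank approximation, and I believe the original matrix cannot be reconstructed uniquely from it.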
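To illustrate, here is a sketch with a small matrix that is constructed to have rank 3, so the effect of keeping all or only some of the nonzero singular values is easy to see. The matrix and the names errFull, errTrunc and sAll are stand-ins for this example.

```matlab
rng(5);                                   % fixed seed so the run is repeatable
fulldata = rand(8, 3) * rand(3, 6);       % 8 x 6 matrix of rank 3

% Keeping all three nonzero singular values reproduces the matrix.
[U, S, V] = svds(fulldata, 3);
errFull = norm(fulldata - U * S * V');

% Keeping only two gives a rank-2 approximation.
[U2, S2, V2] = svds(fulldata, 2);
errTrunc = norm(fulldata - U2 * S2 * V2');

% The 2-norm error of the truncated product equals the first
% singular value that was dropped.
sAll = svd(fulldata);
```

Here errFull should be at the level of rounding error, while errTrunc should match sAll(3).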

Weighted sum of elements in matrix - Matlab?

I have two 50 x 6 matrices, say A and B. I want to assign weights to each element of columns in matrix - more weight to elements occurring earlier in a column and less weight to elements occurring later in the same column...likewise for all 6 columns. Something like this:
cumsum(weight(row)*(A(row,col)-B(row,col))); % cumsum is the cumulative sum
How can we do it efficiently without using loops?
If you have your weight vector w as a 50x1 vector, then you can rewrite your code as
cumsum(repmat(w,1,6).*(A-B))
BTW, I don't know why you have the cumsum operating on a scalar in a loop... it has no effect. I'm assuming that you meant that's what you wanted to do with the entire matrix. Calling cumsum on a matrix will operate along each column by default. If you need to operate along the rows, you should call it with the optional dimension argument as cumsum(x,2), where x is whatever matrix you have.
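As a quick check of the vectorized form, the sketch below compares it against an explicit double loop. It uses bsxfun in place of repmat, which gives the same result, and the weight vector w is a hypothetical decreasing sequence chosen only for illustration.

```matlab
rng(6);                                   % fixed seed so the run is repeatable
A = rand(50, 6);
B = rand(50, 6);
w = linspace(1, 0.1, 50)';                % hypothetical weights, larger for earlier rows

% Vectorized: scale each row of (A - B) by its weight, then accumulate
% down each column.
result = cumsum(bsxfun(@times, w, A - B), 1);

% Reference implementation with explicit loops.
expected = zeros(50, 6);
for col = 1:6
    running = 0;
    for row = 1:50
        running = running + w(row) * (A(row, col) - B(row, col));
        expected(row, col) = running;
    end
end

maxDiff = max(abs(result(:) - expected(:)));
```

The last row of result holds the total weighted sum for each of the 6 columns.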