Eigenvalues from MATLAB - matlab

I'm trying to compute Eigenvalues/Eigenvectors for large datasets in order to perform PCA. I can calculate the Eigenvalues and Eigenvectors for 2x2, 3x3 matrices, etc.
The problem is that I have a 451x128 dataset, from which I compute a 128x128 covariance matrix. It looks something like the following:
A = [ [1, 2, 3,
       2, 3, 1,
       ..........   (128 values) ]
      [5, 4, 1,
       3, 2, 1,
       2, 1, 2,
       ..........   (128 values) ]
      ..........    (128 blocks) ]
Computing the Eigenvalues and Eigenvectors for a 128x128 matrix seems really difficult and would take a lot of computing power. However, if I treat each of the blocks in A as a 2-dimensional (3xN) matrix, I can compute a covariance matrix for each block, which gives me a 3x3 matrix.
My question is this: would this be a good or reasonable approach for solving the Eigenvalues and Eigenvectors? Something like this:
A is a 2-dimensional array containing 128x451 values;
for each of the blocks, compute the Eigenvalues and Eigenvectors of its covariance matrix, like so:
Eig1 = eig(cov(A{1}))
Eig2 = eig(cov(A{2}))
This would then give me 128 sets of Eigenvalues (one for each of the blocks inside the 128x128 matrix).
If this is not correct, how does MATLAB handle such large dimensional data?

Have you tried svd()?
Do the singular value decomposition:
[U,S,V] = svd(X)
U and V are orthogonal matrices and S contains the singular values (the eigenvalues of X*X' are their squares). Sort the columns of U and V in descending order based on S (MATLAB's svd already returns them in that order).
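For the 451x128 case above, a minimal sketch of PCA via the SVD might look like this (assuming X is the 451x128 data matrix with one observation per row; the variable names are just for illustration):
Xc = bsxfun(@minus, X, mean(X, 1)); % Centre the data
[U, S, V] = svd(Xc, 'econ'); % Economy-size SVD
pcs = V; % Columns of V are the principal directions
evals = diag(S).^2 / (size(X,1) - 1); % Eigenvalues of the covariance matrix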

As kkuilla mentions, you can use the SVD of the original matrix, as the SVD of a matrix is related to the Eigenvalues and Eigenvectors of the covariance matrix as I demonstrate in the following example:
A = [1 2 3; 6 5 4]; % A rectangular matrix
X = A*A'; % The (unnormalised, uncentred) covariance matrix of A
[V, D] = eig(X); % Get the eigenvectors and eigenvalues of the covariance matrix
[U,S,W] = svd(A); % Get the singular values of the original matrix
V is a matrix containing the eigenvectors, and D contains the eigenvalues. Now, the relationship:
S*S' ~ D
U ~ V
(up to ordering and sign: eig returns the eigenvalues in ascending order, while svd returns the singular values in descending order, and the columns of U and V may differ by a factor of -1).
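A quick numerical check of that relationship, continuing the example:
sort(diag(D), 'descend') % Eigenvalues of A*A', largest first
diag(S).^2 % Squared singular values of A; the two lists match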
As to your own assumption, I may be misreading it, but I think it is false. I can't see why the Eigenvalues of the blocks would relate to the Eigenvalues of the matrix as a whole; they wouldn't correspond to the same Eigenvectors, as the dimensionality of the Eigenvectors wouldn't match. I think your covariances would be different too, but then I'm not completely clear on how you are creating these blocks.
As to how Matlab does it, it does use some tricks. Perhaps the link below might be informative (though it might be a little old). I believe they use (or used) LAPACK and a QZ factorisation to obtain intermediate values.
https://au.mathworks.com/company/newsletters/articles/matlab-incorporates-lapack.html

Use the command
[Eigenvectors, Eigenvalues] = eig(Matrix)
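Note that the eigenvalues come back on the diagonal of the second output; for example:
A = magic(4);
[V, D] = eig(A); % Columns of V are the eigenvectors
lambda = diag(D); % The eigenvalues as a column vector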

Related

covariance matrix is not positive definite

I have a feature vector (FV1) of size 1*n. I subtract the mean of all feature vectors from FV1, then take its transpose (FV1_Transpose), which is n*1. I then do the matrix multiplication (FV1_Transpose * FV1) to get the covariance matrix, which is n*n.
But my problem is that I don't get a positive definite matrix. I read everywhere that a covariance matrix should be symmetric positive definite.
FV1 after subtraction of mean = -17.7926788,0.814089298,33.8878059,-17.8336430,22.4685001;
Covariance matrix =
316.579407, -14.4848289, -602.954834, 317.308289, -399.774811
-14.4848289, 0.662741363, 27.5876999, -14.5181780, 18.2913647
-602.954834, 27.5876999, 1148.38342, -604.343018, 761.408142
317.308289, -14.5181780, -604.343018, 318.038818, -400.695221
-399.774811, 18.2913647, 761.408142, -400.695221, 504.833496
This covariance matrix is not positive definite. Any idea why this is so?
Thanks in advance.
Are you sure the matrix is not positive definite? I did the following in Octave.
A = [ 316.579407, -14.4848289, -602.954834, 317.308289, -399.774811 -14.4848289, 0.662741363, 27.5876999, -14.5181780, 18.2913647 -602.954834, 27.5876999, 1148.38342, -604.343018, 761.408142 317.308289, -14.5181780, -604.343018, 318.038818, -400.695221 -399.774811, 18.2913647, 761.408142, -400.695221, 504.833496]
A = reshape(A, 5, 5)
svd(A)
The singular values of A as obtained from svd were:
2.2885e+03
5.4922e-05
1.5958e-05
1.3636e-05
1.1507e-08
Please note that all the singular values are positive (for a symmetric matrix, the singular values are the absolute values of the eigenvalues).
Now, A is symmetric (being a covariance matrix). To verify,
A - A'
would give you a 5 x 5 zero matrix.
A symmetric matrix which has positive eigenvalues is positive definite. Note, though, that four of the values above are nearly zero: this matrix is the outer product of a single vector with itself, so it is rank 1 and only positive semi-definite in exact arithmetic, and round-off can push the tiny eigenvalues slightly negative, which is likely why a strict positive-definiteness check fails.
reference
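As an aside, a common numerical test for positive definiteness in MATLAB/Octave is chol, which only succeeds for positive definite matrices; a minimal sketch:
[~, p] = chol(A); % p is 0 if and only if A is (numerically) positive definite
if p == 0
    disp('A is positive definite')
else
    disp('A is not positive definite')
end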

Recovering original matrix from Eigenvalue Decomposition

According to Wikipedia the eigenvalue decomposition should be such that:
http://en.wikipedia.org/wiki/Square_root_of_a_matrix
See section Computational Methods by diagonalization:
So that if matrix A is decomposed such that it has Eigenvectors V and Eigenvalues D, then A=VDV'.
A=[1 2; 3 4];
[V,D]=eig(A);
RepA=V*D*V';
However, in MATLAB, A and RepA are not equal. Why is this?
Baz
In general, the formula is:
RepA = V*D*inv(V);
or, written for better numeric accuracy in MATLAB,
RepA = V*D/V;
When A is symmetric, the V matrix will turn out to be orthogonal, which makes inv(V) = V.'. Your A is NOT symmetric, so you need the actual inverse.
Try it:
A=[1 2; 2 3]; % Symmetric
[V,D]=eig(A);
RepA = V*D*V';
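For the original non-symmetric A, the general formula does recover the matrix:
A = [1 2; 3 4]; % Not symmetric
[V, D] = eig(A);
RepA = V*D/V; % Use the actual inverse, not the transpose
norm(A - RepA) % Close to zero, up to round-off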

Matlab ordfilt2 or alternatives for weighted local max

I would like to compute the weighted maxima of a vector in Matlab. By weighted maxima I mean the following:
Given a vector of 2*N+1 weights W = {w[-N], w[-N+1], ..., w[0], ..., w[N]} and an input sequence A, the weighted maxima is a vector M where m[i] = max(w[-N]*a[i-N], w[-N+1]*a[i-N+1], ..., w[N]*a[i+N]).
So for example, given a vector A = [1, 4, 12, 2, 4] and weights W = [0.5, 1, 0.5], the weighted maxima would be M = [2, 6, 12, 6, 4].
This can be done using ordfilt2, but ordfilt2 uses weights as additive rather than multiplicative.
I am actually working on 4D matrices, but any 1D solution would work, as the 4D weight matrix is separable.
My current solution is to generate shifted copies of the input array A, weight them according to the shift, and take the elementwise maximum across all the arrays. The shift is performed using circshift and is the bottleneck of the process; generating the shifted matrices "manually" through indexing turned out to be even slower.
Can you suggest any more efficient solution?
EDIT: For a positive A, M=exp(ordfilt2(log(A), length(W), ones(size(W)), log(W))) does the job, but still takes longer than the circshift solution above. I am still looking for more efficient solutions.
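For reference, a minimal sketch of the circshift baseline described above (my own reconstruction, assuming A is a row vector and W has odd length 2*N+1; note that circshift wraps around, so the edges behave circularly):
N = floor(numel(W)/2);
M = -inf(size(A)); % Running elementwise maximum
for k = -N:N
    M = max(M, W(k+N+1) * circshift(A, [0, -k])); % Weighted a[i+k]
end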
>> B = padarray(A, [0 floor(numel(W)/2)], 0); % Pad A with zeros
>> B = bsxfun(@times, B(bsxfun(@plus, 1:numel(B)-numel(W)+1, (0:numel(W)-1)')), W(:)); % Apply the weights
>> M = max(B) % Compute the local maxima
M =
2 6 12 6 4

How to normalize a matrix of 3-D vectors

I have a 512x512x3 matrix that stores 512x512 three-dimensional vectors. What is the best way to normalize all those vectors, so that the result is 512x512 vectors of length 1?
At the moment I use for loops, but I don't think that is the best way in MATLAB.
If the vectors are Euclidean, the length of each is the square root of the sum of the squares of its coordinates. To normalize each vector individually so that it has unit length, you need to divide its coordinates by its norm. For that purpose you can use bsxfun:
norm_A = sqrt(sum(A .^ 2, 3)); %// Calculate Euclidean length
norm_A(norm_A < eps) = 1; %// Avoid division by zero
B = bsxfun(@rdivide, A, norm_A); %// Normalize
where A is your original 3-D vector matrix.
EDIT: Following Shai's comment, added a fix to avoid possible division by zero for null vectors.
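As a side note, in MATLAB R2016b and later, implicit expansion lets you drop bsxfun entirely; an equivalent sketch:
norm_A = sqrt(sum(A .^ 2, 3)); % Euclidean length of each vector
norm_A(norm_A < eps) = 1; % Avoid division by zero
B = A ./ norm_A; % Implicit expansion along the third dimension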

Comparing two sets of vectors

I've got matrices A and B
size(A) = [n x]; size(B) = [n y];
Now I need to compare the Euclidean distance of each column vector of A with each column vector of B. I'm using the dist function right now:
Q = dist([A B]); Q = Q(1:x, x+1:end);
But it also does a lot of needless work (like calculating the distances among the vectors of A and among the vectors of B themselves).
What is the best way to calculate this?
You are looking for pdist2.
% Compute the ordinary Euclidean distance
D = pdist2(A.',B.','euclidean'); % euclidean distance
You should take the transpose of the matrices since pdist2 assumes the observations are in rows, not in columns.
An alternative solution to pdist2, if you don't have the Statistics Toolbox, is to compute this manually. For example, one way to do it is:
[X, Y] = meshgrid(1:size(A, 2), 1:size(B, 2)); %// or meshgrid(1:x, 1:y)
Q = sqrt(sum((A(:, X(:)) - B(:, Y(:))) .^ 2, 1));
The indices of the columns from A and B for each value in vector Q can be obtained by computing:
[X(:), Y(:)]
where each row contains a pair of indices: the first is the column index in matrix A, and the second is the column index in matrix B.
Another solution if you don't have pdist2 and which may also be faster for very large matrices is to vectorize the following mathematical fact:
||x-y||^2 = ||x||^2 + ||y||^2 - 2*dot(x,y)
where ||a|| is the L2-norm (euclidean norm) of a.
Comments:
C = -2*A'*B (this is an x by y matrix) is the vectorization of all the dot products.
||x-y||^2 is the square of the Euclidean distance which you are looking for.
Is that enough or do you need the explicit code?
The reason this may be faster asymptotically is that you avoid evaluating the metric explicitly for each of the x*y comparisons and instead make the bottleneck a matrix multiplication, which is highly optimized in MATLAB. You are taking advantage of the fact that this is the Euclidean distance and not just some unknown metric.
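For completeness, here is a minimal sketch of the explicit code for that trick (my own reconstruction, not part of the original answer):
nA = sum(A.^2, 1); % 1-by-x squared norms of the columns of A
nB = sum(B.^2, 1); % 1-by-y squared norms of the columns of B
D2 = bsxfun(@plus, nA.', nB) - 2*(A.'*B); % x-by-y squared distances
D2(D2 < 0) = 0; % Clamp small negatives caused by round-off
D = sqrt(D2); % Euclidean distances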