Octave/Matlab: PCA on sparse matrix: how to get only the most important eigenvectors?

I am using Octave and have a huge sparse matrix that I have to get the eigenvalues of. However, if I just use a function to get all eigenvalues and eigenvectors, the result will take up way too much space, since the input matrix is sparse for a reason.
How can I get only a limited number of the most important eigenvectors?

Use eigs instead of eig:
D = eigs(A,k);
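This returns the k largest-magnitude eigenvalues of the matrix A; call it with two outputs, [V,D] = eigs(A,k), to get the corresponding eigenvectors as well. According to this page, Octave does support eigs for sparse matrices. eigs uses different (iterative) techniques than eig, is slower overall when you need the whole spectrum, and generally shouldn't be used except in cases such as the one you describe.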
Be sure to check out the options for the sigma argument in case you want the largest eigenvalues with respect to their real parts only, for example.
The Matlab documentation for eigs is here.
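As a minimal sketch of the two-output form, the snippet below builds a small sparse symmetric matrix as a stand-in for the real one (the matrix, n, and k are made up for illustration) and checks the result through the residual of A*V = V*D. It is only meant to show the call pattern, not a full PCA pipeline.

```octave
% Hypothetical stand-in for the large sparse symmetric matrix
% (in a PCA setting this would be the covariance matrix)
n = 100;
k = 5;
A = spdiags([0.1*ones(n,1), (1:n)', 0.1*ones(n,1)], -1:1, n, n);

% Two outputs: V holds the eigenvectors as columns,
% D holds the matching eigenvalues on its diagonal
[V, D] = eigs(A, k);
lambda = diag(D);

% Residual of A*V = V*D; close to zero if the solver converged
residual = norm(A*V - V*D, 'fro');
```

Only the n-by-k matrix V is stored, so memory grows with k rather than with the full n-by-n set of eigenvectors.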

Related

scipy eigs finds complex eigenvectors, although they should be real

I have a 1047x1047 sparse matrix and am interested in its eigenvectors and eigenvalues. From the mathematical derivation I know that these must be real. scipy eigs finds complex ones though. Unfortunately this destroys everything. There is no possibility to create a minimal example, because for small matrices eigs also calculates real eigenvectors, i.e. complex eigenvectors with imaginary part zero.
Desperate thanks!

Are the largest eigenvectors sorted by absolute eigenvalue?

Say I have a real symmetric matrix, and I want to extract the k largest eigenvectors using eig.
I know I can use eigs instead but that's not the point of my question.
I read that they use different algorithms, and the documentation for eigs states explicitly "largest magnitude", which seems to imply that the eigenvalues would be sorted in absolute value, especially because apparently the sign of the eigenvector/values does not matter.
However I also read that ordering the eigenvectors should be done according to the ranking of the eigenvalues, with sort(diag(D)); no absolute value here (and no assumption about positiveness for the matrix).
I think that either the latter post is wrong, or Matlab's documentation for eigs is wrong or misleading when using the words "largest magnitude", is that right? Or are they both right and I misunderstood something?
To clarify then, the "largest" eigenvectors should be sorted according to the absolute eigenvalue, correct?
The latter question you reference is discussing eig, which uses direct methods intended for general, dense matrices; you are discussing eigs, which uses iterative methods intended for general, sparse matrices.
eig returns the eigenvalues in whatever order the direct method produces them; in general, no particular ordering is guaranteed. For real symmetric matrices, the direct method appears to return the eigenvalues in ascending order, from most negative to most positive.
eigs will return k eigenvalues and vectors with the largest/smallest magnitude/real part/imaginary part (user-specified). However, as noted, "eigs does not always return sorted eigenvalues and eigenvectors. Use sort to explicitly sort the output eigenvalues and eigenvectors in cases where their order is important."
Typically, yes, largest (absolute) magnitude. Though sort can be used to permute the eigenvalues into any required order.
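To make the sorting step concrete, here is a small sketch that takes the full output of eig for a real symmetric matrix and reorders it by absolute eigenvalue. The 3-by-3 matrix and the variable names are purely illustrative; the point is that the same permutation must be applied to the eigenvalues and to the columns of V.

```octave
% Small symmetric example with one large negative eigenvalue (illustrative only)
A = [2 1 0; 1 -5 1; 0 1 1];
k = 2;

[V, D] = eig(A);
lambda = diag(D);

% Sort by absolute value, largest first, and permute the eigenvectors to match
[~, idx] = sort(abs(lambda), 'descend');
lambda = lambda(idx);
V = V(:, idx);

% The k "largest" eigenpairs in the largest-magnitude sense
lambda_k = lambda(1:k);
Vk = V(:, 1:k);
```

With sort(diag(D)) instead, the negative eigenvalue would end up first in ascending order, which is why the two conventions give different "top k" sets for indefinite matrices.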

MATLAB: Eig algorithm and alternatives

I am simulating a physical system, where I need to calculate the eigenvalues and vectors of a very large (~10000x10000) matrix.
So far I have used the built-in eig algorithm in MATLAB, but it is very slow for large matrices. Are there other algorithms in MATLAB that would do a better job, or can I somehow improve the performance of eig? Specifically, it turns out that I only need the first ~100 eigenvectors of the matrix, starting from the numerically smallest eigenvalue. Is there a way to get the algorithm to calculate only the first N eigenvectors and eigenvalues to save computation time? Of course this would only work if the eigenvectors come out sorted, but they seem to do so because of the symmetry of the matrix I am using.
Your matrix has mostly zeros, so you should make it a sparse matrix. You'll then be able to use EIGS to calculate a smaller number of eigenvalues and eigenvectors.
http://www.mathworks.com/help/matlab/ref/eigs.html
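A rough sketch of that suggestion, assuming the matrix is real symmetric as the question says: convert it with sparse and ask eigs for the N algebraically smallest eigenvalues using the 'sa' option (newer MATLAB releases also call this 'smallestreal'). The test matrix, n, and N below are hypothetical and much smaller than the real problem.

```octave
% Hypothetical stand-in for the mostly-zero symmetric matrix
n = 100;
N = 3;
Afull = diag(1:n) + 0.1*(diag(ones(n-1,1), 1) + diag(ones(n-1,1), -1));

% Store only the nonzeros, then compute the N smallest (algebraic) eigenpairs
A = sparse(Afull);
[V, D] = eigs(A, N, 'sa');

% eigs does not promise sorted output, so sort explicitly (ascending)
[lambda, idx] = sort(diag(D));
V = V(:, idx);
```

If the smallest eigenvalues are tightly clustered, the iteration can converge slowly; passing a numeric shift as the third argument (shift-and-invert) is a common alternative in that case.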

MATLAB: Get small eigenvalues from `eigs` in sorted order

For example, eigs(A,k,'sm') returns the k smallest-magnitude eigenvalues. However, eigs does not take the sign into account. Edit: eigs(A,k,'sr') takes care of it.
Say A is a 500-by-500 sparse matrix. Without computing all eigenvalues as eig does, how can I efficiently get the 3 smallest eigenvalues (by value, not magnitude) and the corresponding eigenvectors from eigs in sorted order?
This is easy to do by computing all eigenvalues with eig and sorting them, but I cannot use eig because converting to a full matrix and computing all eigenvalues takes a long time and a huge amount of memory.
Edit: This can also be done with eigs(A,k,'sr') and sorting the result myself. But is there a faster method, or an option in eigs, to do so?
It should not do that unless there is a syntax error or your matrix has all the eigenvalues with positive real part. This gives the correct negative signed smallest real part (I guess that's what you mean by small) eigenvalues on R2016a. Note that smallest eigs are complex conjugates and one pair is given by only its negative imaginary part.
A = sprand(100,100,0.5);
[V,D] = eigs(A,3,'sr')
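If a guaranteed order is needed, sorting the k returned values afterwards is cheap, since only k numbers are involved. The sketch below uses a made-up nonsymmetric sparse matrix whose eigenvalues are known (upper bidiagonal, so the eigenvalues are the diagonal entries -2, -1, 0, 1, ...) and sorts the 'sr' output by real part.

```octave
% Hypothetical nonsymmetric sparse matrix with eigenvalues -2, -1, 0, ..., n-3
n = 50;
k = 3;
A = spdiags([(1:n)' - 3, 0.5*ones(n,1)], [0 1], n, n);

% k eigenvalues with the smallest real part (signed, not by magnitude)
[V, D] = eigs(A, k, 'sr');
lambda = diag(D);

% Sort ascending by real part and reorder the eigenvectors to match
[~, idx] = sort(real(lambda), 'ascend');
lambda = lambda(idx);
V = V(:, idx);
```

For this matrix 'sm' would instead pick the eigenvalues closest to zero, which is the difference the question is about.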

Matlab - calculating max eigenvalue of a big sparse (A'*A) matrix

I have a big (400K*400K) sparse matrix and I need to calculate the largest eigenvalue of A'*A.
The problem is that Matlab can't even calculate A' due to memory problems.
I also tried [a,b,c] = find(A) and then building the transpose as a new sparse matrix, but although find() works, the sparse creation doesn't.
Is there a nice solution for this? It can be either a Matlab function or another technique for calculating the largest eigenvalue of this kind of product.
Thanks.
If A is sparse, see this thread and some discussion in this documentation (basically do it part by part) for a way to transpose it etc.
But now you need to calculate B=A'*A. The question is whether it is still sparse. Assuming it is, there shouldn't be a problem proceeding with the technique mentioned in the link above.
Then after you've obtained B=A'*A, use eigs
eigs(B,1)
to obtain the largest magnitude eigenvalue.
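If forming A' or A'*A is itself the bottleneck, one possible workaround is to pass eigs a function handle that applies A'*A to a vector without building either matrix. The sketch below is an assumption-laden illustration, not something from the thread: the small random A, the handle name AtAfun, and the opts settings are all made up, and it relies on the identity A'*(A*x) = ((A*x)'*A)', which only transposes dense vectors.

```octave
% Hypothetical small stand-in for the 400K-by-400K sparse matrix
m = 60;
n = 40;
A = sprandn(m, n, 0.2);

% Apply A'*A to a vector without forming A' or A'*A:
% (A*x)' is a dense row vector, and (row * A)' equals A'*(A*x)
AtAfun = @(x) ((A*x)' * A)';

% Tell eigs that the operator is real and symmetric
opts.issym = true;
opts.isreal = true;

% Largest-magnitude eigenvalue of A'*A (n is the size of the operator)
lambda_max = eigs(AtAfun, n, 1, 'lm', opts);
```

Since A'*A is symmetric positive semidefinite, this value is also the square of the largest singular value of A.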