Get non-normalized eigenvectors in scipy

Scipy and Numpy return normalized eigenvectors. I am trying to use the vectors for a physical application, and I need them to not be normalized.
For example:
import numpy as np
import scipy.linalg as spl
a = np.matrix('-3, 2; -1, 0')
W, V = spl.eig(a)
scipy returns the eigenvalues (W) [-2, -1] and the modal matrix (V) (eigenvectors as columns) [[0.89442719 0.70710678] [0.4472136 0.70710678]]
I need the original modal matrix [[2 1][1 1]]

You should have a look at sympy. This package solves such problems by symbolic (algebraic) computation instead of the numerical computation that numpy does.
import sympy as sp
sp.init_printing(use_unicode=True)
mat_a = sp.Matrix([[-3, 2], [-1, 0]])
mat_a.eigenvects()
The result is given as (eigenvalue, multiplicity, eigenvectors):
[(-2, 1, [[2],[1]]), (-1, 1, [[1],[1]])]
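If the exact integer vectors are then needed for numeric work, one possible follow-up (my addition, not part of the original answer) is to convert the sympy result back into numpy arrays:
import numpy as np
import sympy as sp
mat_a = sp.Matrix([[-3, 2], [-1, 0]])
# eigenvects() yields (eigenvalue, multiplicity, [basis vectors]) tuples
for val, mult, vecs in mat_a.eigenvects():
    for v in vecs:
        # convert the exact sympy column vector to a float numpy array
        print(float(val), np.array(v, dtype=float).ravel())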

According to various related threads (1) (2) (3), there is no such thing as a "non normalized" eigenvector.
Indeed, an eigenvector v corresponding to the eigenvalue l of the matrix A is defined by
A*v = l*v
and can therefore be multiplied by any nonzero scalar and remain valid.
While, depending on the algorithm, the computed eigenvector can have a norm different from 1, this norm does not carry any particular meaning (physical or otherwise) and should not be relied on. It is customary for numerical libraries (scipy, R, Matlab, etc.) to return normalized eigenvectors.
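That said, if you want scipy's output to look like the integer modal matrix above, one option (a sketch of my own; it assumes each column has a nonzero entry that you are happy to scale to 1) is to rescale each returned column yourself:
import numpy as np
import scipy.linalg as spl
a = np.array([[-3.0, 2.0], [-1.0, 0.0]])
W, V = spl.eig(a)
V_rescaled = V.copy()
for j in range(V.shape[1]):
    col = V[:, j]
    nz = col[np.abs(col) > 1e-12]
    # divide each column by its nonzero entry of smallest magnitude
    V_rescaled[:, j] = col / nz[np.argmin(np.abs(nz))]
print(V_rescaled.real)  # approximately [[2, 1], [1, 1]] for this example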

It is important to note that normalizing eigenvectors can also change the direction/sign of the vectors. This could have consequences for some applications and the programmer should double check to ensure the signs make sense.
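If a reproducible sign matters, one common convention (a sketch of my own, for real eigenvector matrices; scipy does not apply it for you) is to flip each column so that its largest-magnitude entry is positive; the flipped columns are still valid eigenvectors:
import numpy as np
import scipy.linalg as spl
def fix_signs(V):
    # flip each column so that its largest-magnitude entry is positive
    idx = np.argmax(np.abs(V), axis=0)
    signs = np.sign(V[idx, np.arange(V.shape[1])])
    signs[signs == 0] = 1.0
    return V * signs
W, V = spl.eig(np.array([[-3.0, 2.0], [-1.0, 0.0]]))
print(fix_signs(V.real))  # same eigenvectors, deterministic signs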

Related

How does Matlab normalize generalized eigenvectors?

I know that the eigenvectors produced by eig(A) have 2-norm 1. But what about the vectors produced in the generalized eigenvalue problem eig(A,B)? A natural conjecture is that such a vector v should satisfy v'*B*v = 1. When B is the identity matrix, v'*B*v is exactly the square of the 2-norm. I ran the following test for various matrices A and B:
[p,d]=eig(A,B);
v=p(:,1);
v'*B*v
I always choose B to be diagonal. I noticed that v'*B*v is not always 1. However, it is indeed 1 when A is symmetric. Does anyone know the rule by which Matlab normalizes the generalized eigenvectors? I can't find it in the documentation.
According to the documentation (emphasis mine):
The form and normalization of V depends on the combination of input arguments:
[...]
[V,D] = eig(A,B) and [V,D] = eig(A,B,algorithm) returns V as a matrix whose columns are the generalized right eigenvectors that satisfy A*V = B*V*D. The 2-norm of each eigenvector is not necessarily 1. In this case, D contains the generalized eigenvalues of the pair, (A,B), along the main diagonal.
When eig uses the 'chol' algorithm with symmetric (Hermitian) A and symmetric (Hermitian) positive definite B, it normalizes the eigenvectors in V so that the B-norm of each is 1.
This means that, unless eig uses the 'chol' algorithm, V is not necessarily normalized.
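An analogous check can be done in scipy: as far as I know, scipy.linalg.eigh with a symmetric A and a symmetric positive definite B returns B-normalized eigenvectors, much like MATLAB's 'chol' path. A quick sketch (the matrices below are just random stand-ins):
import numpy as np
from scipy.linalg import eigh
rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = M + M.T                  # symmetric A
N = rng.standard_normal((4, 4))
B = N @ N.T + 4 * np.eye(4)  # symmetric positive definite B
w, V = eigh(A, B)            # generalized symmetric-definite problem
print(V.T @ B @ V)           # approximately the identity: each B-norm is 1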
If I understand you correctly, you are looking for a way to normalize a vector: given a vector, you can divide it by its norm to obtain a new vector whose norm is 1.
If you are looking for the mathematical background, then Eigendecomposition of a matrix contains a good introduction.

Eigenvalues are always 1

When I get the eigenvalues of the diagonal of a PCA transformed image, I always get 1, whatever the image. What's the reason behind this?
I used the following code.
coeff = pca(pmap);
disp(coeff);
[V,L] = eig(coeff'*coeff);
Lamda = diag(L);
disp(Lamda);
The coeff that pca outputs are already eigenvectors, and they are all orthogonal. They are even orthonormal, since MATLAB normalizes them. The relative weights are in the explained output parameter of pca.
So transpose(coeff)*coeff gives you the identity matrix, and the eigenvalues of the identity matrix are, obviously, all 1.
So the reason is simply that this is how linear algebra works.
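The same effect is easy to reproduce outside MATLAB. Here is a small numpy sketch (my own illustration, using the right singular vectors of the centered data as orthonormal PCA loadings):
import numpy as np
rng = np.random.default_rng(1)
X = rng.standard_normal((100, 5))
Xc = X - X.mean(axis=0)          # center the data
# PCA loadings = right singular vectors of the centered data (orthonormal columns)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
coeff = Vt.T
gram = coeff.T @ coeff           # orthonormal loadings give the identity matrix
print(np.linalg.eigvalsh(gram))  # all (approximately) 1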

Sign determination of singular vectors in Matlab's svd function

Does anybody know how the sign of the singular vectors resulting from Matlab's svd function is determined?
Let:
B = U*S*V'
be a valid SVD of a real or complex 2-by-2 matrix B; then
B = (U*c)*S*(V*c)'
is also valid, where c is a matrix that changes the sign of one or both singular vectors:
c = diag([1 -1]), diag([-1 1]) or diag([-1 -1]).
I want to know how Matlab's svd algorithm determines the sign of the singular vectors in U and V.
Matlab uses LAPACK's DGESVD routine for singular value decomposition, which does not take the direction of the resulting vectors into account. In applications where the SVD is performed, the decomposed data is processed, and the data is then reconstructed, the signs make no difference. They only become important when the decomposed data itself is being analyzed.
One might apply a sign-correction algorithm after performing the SVD with Matlab. But I believe the appropriate sign correction depends on the actual meaning of the data.
In the paper you provided, the direction is chosen to agree with the majority of the data points. This won't work for data with a symmetric distribution, since the theoretical direction is zero and the sample direction will be random, resulting in high numerical instability.
If the goal is just numerical stability of the solution, it would be enough to choose some fixed vector and flip all SVD vectors to lie in the same half-space as it.
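A minimal sketch of that last idea in Python (my own illustration of one possible convention for real matrices, not what LAPACK itself does): pick a fixed reference vector and flip every left singular vector that points away from it, flipping the matching right singular vector as well so the reconstruction is unchanged:
import numpy as np
def svd_fixed_signs(B, ref=None):
    # thin SVD, so each column of U pairs with a row of Vt
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    if ref is None:
        ref = np.ones(U.shape[0])  # arbitrary but fixed reference vector
    # flip left singular vectors pointing away from ref, and the matching
    # right singular vectors, so U @ np.diag(s) @ Vt stays the same
    flips = np.where(ref @ U < 0, -1.0, 1.0)
    return U * flips, s, Vt * flips[:, None]
B = np.array([[1.0, 2.0], [3.0, 4.0]])
U, s, Vt = svd_fixed_signs(B)
print(np.allclose(U * s @ Vt, B))  # reconstruction unchanged: True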

Principal Components calculated using different functions in Matlab

I am trying to understand principal component analysis in Matlab. There seem to be at least three different functions that do it.
I have some questions regarding the code below:
Am I correctly creating approximate x values using only one eigenvector (the one corresponding to the largest eigenvalue)? I think so.
Why are PC and V, which are both meant to be the loadings for (x'*x), presented differently? The column order is reversed because eig does not put the largest eigenvalue first, but why are they the negative of each other?
Why are the eig results not ordered with the eigenvector corresponding to the largest eigenvalue in the first column?
Using the code below I get back to the input matrix x when using svd and eig, but the results from princomp seem to be totally different. What do I have to do to make princomp match the other two functions?
Code:
x=[1 2;3 4;5 6;7 8 ]
econFlag=0;
[U,sigma,V] = svd(x,econFlag);%[U,sigma,coeff] = svd(z,econFlag);
U1=U(:,1);
V1=V(:,1);
sigma_partial=sigma(1,1);
score1=U*sigma;
test1=score1*V';
score_partial=U1*sigma_partial;
test1_partial=score_partial*V1';
[PC, D] = eig(x'*x)
score2=x*PC;
test2=score2*PC';
PC1=PC(:,2);
score2_partial=x*PC1;
test2_partial=score2_partial*PC1';
[o1 o2 o3]=princomp(x);
Yes. According to the documentation of svd, the diagonal elements of the output S are in decreasing order. There is no such guarantee for the output D of eig, though.
Eigenvectors and singular vectors have no defined sign. If a is an eigenvector, so is -a.
I've often wondered the same. Laziness on the part of TMW? Optimization, because sorting would be an additional step and not everybody needs 'em sorted?
princomp centers the input data before computing the principal components. This makes sense as normally the PCA is computed with respect to the covariance matrix, and the eigenvectors of x' * x are only identical to those of the covariance matrix if x is mean-free.
I would compute the PCA by transforming to the basis of the eigenvectors of the covariance matrix (computed from the centered data), but apply this transform to the original (uncentered) data. This makes it possible to capture a maximum of variance with as few principal components as possible, and still to recover the original data from all of them:
[V, D] = eig(cov(x));
score = x * V;
test = score * V';
test is identical to x, up to numerical error.
In order to easily pick the components with the most variance, let's fix that lack of sorting ourselves:
[V, D] = eig(cov(x));
[D, ind] = sort(diag(D), 'descend');
V = V(:, ind);
score = x * V;
test = score * V';
Reconstruct the signal using the strongest principal component only:
test_partial = score(:, 1) * V(:, 1)';
In response to Amro's comments: It is of course also possible to first remove the means from the input data, and transform these "centered" data. In that case, for perfect reconstruction of the original data it would be necessary to add the means again. The way to compute the PCA given above is the one described by Neil H. Timm, Applied Multivariate Analysis, Springer 2002, page 446:
Given an observation vector Y with mean mu and covariance matrix Sigma of full rank p, the goal of PCA is to create a new set of variables called principal components (PCs) or principal variates. The principal components are linear combinations of the variables of the vector Y that are uncorrelated such that the variance of the jth component is maximal.
Timm later defines "standardized components" as those which have been computed from centered data and are then divided by the square root of the eigenvalues (i.e. variances), i.e. "standardized principal components" have mean 0 and variance 1.
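For completeness, here is the same recipe sketched in Python/numpy (my own translation, not part of the original MATLAB answer): center the data, sort the eigenpairs of the covariance matrix, and divide the centered scores by the square roots of the eigenvalues to obtain standardized components with unit variance:
import numpy as np
rng = np.random.default_rng(0)
x = rng.standard_normal((200, 3)) @ np.diag([3.0, 1.0, 0.3])
xc = x - x.mean(axis=0)                  # center the data
D, V = np.linalg.eigh(np.cov(xc, rowvar=False))
order = np.argsort(D)[::-1]              # largest variance first
D, V = D[order], V[:, order]
score = xc @ V                           # centered principal component scores
standardized = score / np.sqrt(D)        # standardized principal components
print(standardized.std(axis=0, ddof=1))  # approximately [1, 1, 1]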

scipy generalized eigenproblem with a positive semidefinite matrix

I want to compute a generalized eigendecomposition of the form
Lf = lambda Af
using the scipy.sparse.linalg.eigs function, but I get this error:
/usr/local/lib/python2.7/dist-packages/scipy/linalg/decomp_lu.py:61: RuntimeWarning: Diagonal number 65 is exactly zero. Singular matrix.
RuntimeWarning)
** On entry to DLASCL parameter number 4 had an illegal value
I am passing three arguments: a diagonal matrix, a positive semi-definite (PSD) matrix, and a numeric value K (the first K eigenvalues). Matlab's eigs function performs well with the same input parameters, but in SciPy, as I understand it, in order to compute with a PSD matrix I need to specify the sigma parameter as well.
So, my question is: is there a way to avoid setting the sigma parameter, as in Matlab, or if not, how do I pick a sigma value?
The error appears to mean that in your generalized eigenproblem
L x = lambda A x
the matrix A is not positive definite (check the eigs docstring -- in your case the matrix is probably singular). This is a requirement for ARPACK mode 2. However, you can try specifying sigma=0 to switch to ARPACK mode 3 (but note that the meaning of the which parameter is inverted in this case!).
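A sketch of what that call might look like (illustrative only: the tridiagonal L and diagonal A below are small stand-ins for your matrices, and K for your number of eigenvalues):
import numpy as np
import scipy.sparse as sparse
from scipy.sparse.linalg import eigs
n, K = 50, 4
main = 2.0 * np.ones(n)
off = -1.0 * np.ones(n - 1)
L = sparse.diags([off, main, off], [-1, 0, 1], format='csc')  # stand-in for L
A = sparse.diags(np.linspace(1.0, 2.0, n), format='csc')      # stand-in for A
# sigma=0 switches ARPACK to shift-invert (mode 3); with sigma set, which='LM'
# selects the eigenvalues lambda closest to sigma, i.e. largest 1/(lambda - sigma)
w, v = eigs(L, k=K, M=A, sigma=0, which='LM')
print(np.sort(w.real))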
Now, I'm not sure what Matlab does, but a possibility is that it's calculating the pseudoinverse rather than the inverse of A. To emulate this, do
from scipy.sparse.linalg import eigs, LinearOperator
from scipy.linalg import lstsq
# Minv approximates the action of pinv(A) via a least-squares solve
Ainv = LinearOperator(matvec=lambda x: lstsq(A, x)[0], shape=A.shape)
w, v = eigs(L, M=A, Minv=Ainv)
Check the results --- I don't know what will happen in this case.
Alternatively, you may try to specify a nonzero sigma. What you should select depends on the matrices involved. It affects which eigenvalues are picked --- for instance, with which='LM' they are those for which lambda' = 1/(lambda - sigma) is large. Otherwise sigma can probably be chosen fairly arbitrarily; of course, it is better for the Krylov iteration if the transformed eigenvalues lambda' you are interested in are well separated from the other eigenvalues.
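As a toy illustration of that last point (my own example, using a standard rather than generalized problem): with a nonzero sigma, which='LM' simply returns the eigenvalues closest to the shift:
import numpy as np
from scipy.sparse.linalg import eigs
D = np.diag(np.arange(1.0, 11.0))  # toy matrix with eigenvalues 1..10
w, v = eigs(D, k=3, sigma=6.2, which='LM')
print(np.sort(w.real))             # the three eigenvalues closest to 6.2: 5, 6, 7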