I have a 50x49 matrix A that has 49 linearly independent columns. However, my software (Octave) tells me its rank is 44.
Is it due to some computational error? If so, how can I prevent such errors?
If the software was able to correctly calculate rref(A), then why did it fail with rank(A)? Does it mean that calculating rank(A) is more error prone than calculating rref(A), or vice versa? I mean rref(A) actually tells you the rank, but here's a contradiction.
P.S. I've checked, Python makes the same error.
EDIT 1: Here is the matrix A itself. The first 9 columns were given. The rest was obtained with polynomial features.
EDIT 2: I was able to find a similar issue. Here is a 10x10 matrix B of rank 10 (and Octave calculates its rank correctly). However, Octave says that rank(B * B) = 9, which is impossible.
The distinction between an invertible matrix (i.e. full rank) and a non-invertible one is clear-cut in theory, but not so in practice. A matrix B with large condition number (as in your example) can be inverted, but computing the inverse is numerically unstable. It roughly corresponds to B having a determinant that is "small" (using an appropriate, relative measure of "small"), so the matrix is almost singular. As a result, the inverse matrix will be computed with bad accuracy. In your example B, the condition number (computed with cond) is 2.069e9.
Another way to look at this is: when the condition number is large, it well could be that B is "really" singular, but small numerical errors from previous computations make it look barely non-singular. So you can't be sure.
The rank and rref functions use different algorithms (singular-value decomposition for rank, Gauss-Jordan elimination with partial pivoting for rref). For well-behaved matrices the numerical errors will be small in both cases, and the results will be consistent. But for an ill-conditioned matrix the numerical errors will be large and potentially different in each case, giving inconsistent results.
This is a well-known issue in numerical linear algebra. In general, avoid inverting matrices with a large condition number.
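To see how rank(B * B) can drop below rank(B), here is a minimal sketch. The matrix is a made-up stand-in (not the B from the question), built so that cond(B) is about 1e9. Squaring it roughly squares the condition number, which pushes the smallest singular values of B*B below the tolerance that rank uses.

```matlab
% Hypothetical test matrix: singular values 1, 1e-1, ..., 1e-9, so cond(B) is about 1e9.
[Q, ~] = qr(reshape(sin(1:100), 10, 10));   % Q is just some orthogonal matrix
B = Q * diag(10.^(0:-1:-9)) * Q';

rB  = rank(B);        % every singular value is far above the tolerance
rBB = rank(B * B);    % cond(B*B) is about 1e18, beyond 1/eps (about 4.5e15)

s   = svd(B * B);
tol = max(size(B)) * eps(max(s));   % MATLAB-style default tolerance

fprintf('cond(B) = %.3g, rank(B) = %d\n', cond(B), rB);
fprintf('rank(B*B) = %d, singular values at or below tol: %d\n', rBB, sum(s <= tol));
```

In exact arithmetic B*B would have full rank; the drop reported here comes only from the singular values that fall under the floating-point tolerance.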
Related
I am wondering whether there is a technical or theoretical reason why MATLAB's rank function treats singular values below max(size(A))*eps(norm(A)) as zero. Can you please provide some intuition?
Thank you!
The following answer is not based on proper mathematical reasoning; it is just some speculation (as you were asking for intuition):
norm(A) is the order of magnitude of the matrix entries.
eps(norm(A)) is thus the accuracy that the floating point representation of the matrix entries typically has.
Now, consider adding N numbers that should theoretically add up to zero, but each of them carries an error of eps. I think we would expect an error on the order of sqrt(N) * eps for the result.
Then, given that the algorithm that computes the rank performs N^2 operations on the matrix entries (where N is its size) to produce a number that is checked against zero, the error we would expect is sqrt(N^2) * eps = N * eps. With N = max(size(A)) and eps = eps(norm(A)), that is the threshold you stated in your question.
What I don't know is whether the algorithm that MATLAB uses really has complexity N^2.
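To make the threshold concrete, here is a small sketch using magic(4), which is singular in exact arithmetic (rank 3). The computed smallest singular value is typically a tiny nonzero number, and the tolerance from the question is what lets it be counted as zero. Counting singular values by hand like this is only an illustration of the idea, not a claim about how rank is implemented internally.

```matlab
A   = magic(4);                          % singular in exact arithmetic (rank 3)
s   = svd(A);                            % singular values, largest first
tol = max(size(A)) * eps(norm(A));       % threshold quoted in the question
r   = sum(s > tol);                      % singular values treated as nonzero

fprintf('smallest singular value: %g\n', s(end));
fprintf('tolerance: %g, numerical rank: %d\n', tol, r);
```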
I use eigs to calculate the eigenvectors of large sparse square matrices (dimension in the tens of thousands).
What I want is the eigenvectors belonging to the smallest eigenvalues.
But
eigs(A, 10, 'sm') % Note: A is the matrix
runs very slow.
However, using eigs(A, 10, 'lm') gives me the answer considerably faster.
I also tried replacing 10 with A_width in eigs(A, 10, 'lm') so that the result includes all the eigenvectors, but that doesn't solve the problem, because it becomes as slow as using 'sm'.
So, I want to know why calculating the smallest eigenvectors (using 'sm') is much slower than calculating the largest.
BTW, if you have any idea how to make eigs with 'sm' as fast as with 'lm', please tell me.
The algorithm used in pretty much any standard eigs function is (some variation of) the Lanczos algorithm. It is iterative and the first iterations give you the largest eigenvalues. This explains pretty much every observation you make:
Largest eigenvalues take the least amount of iterations,
Smallest eigenvalues take the maximum amount of iterations,
All eigenvalues also take the maximum amount of iterations.
There are tricks to "fool" eigs into calculating the smallest eigenvalues by actually making them the largest eigenvalues of another problem. This is usually accomplished by a shift parameter. Skimming over the Matlab documentation for eigs, I see that they have a sigma parameter, which might help you. Note the same documentation recommends proper eig if the matrix fits into memory, as eigs has its numerical quirks.
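As a rough sketch of the shift idea: passing a numeric sigma as the third argument asks eigs for the eigenvalues closest to sigma. The test matrix below is a made-up sparse 1-D Laplacian, chosen only because its eigenvalues have a closed form to compare against. Whether this is faster than 'sm' on your matrices is something you would have to measure.

```matlab
n = 200;
e = ones(n, 1);
A = spdiags([-e 2*e -e], -1:1, n, n);      % sparse 1-D Laplacian (hypothetical test matrix)

sigma = 0;                                  % shift: look for eigenvalues nearest 0
d = sort(eigs(A, 3, sigma));                % three eigenvalues closest to sigma

exact = 2 - 2*cos((1:3)' * pi / (n + 1));   % closed-form eigenvalues of this matrix
fprintf('max error vs. closed form: %g\n', max(abs(d - exact)));
```

If you know roughly where the wanted eigenvalues lie, a sigma close to them is usually the natural choice.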
Since eigs is actually an m-file function, we can profile it. I have run a couple of basic tests, and it depends very much on the nature of the data in the matrix. If we run the profiler separately on the following two lines of code:
eigs(eye(1000), 10, 'lm'), and
eigs(eye(1000), 10, 'sm'),
then in the first instance it calls arpackc (the main function that does the work - according to the comments in eigs it's probably from here) a total of 22 times. In the second instance it is called 103 times.
On the other hand, trying it with
eigs(rand(1000), 10, 'lm'), and
eigs(rand(1000), 10, 'sm'),
I get results where the 'lm' option consistently calls arpackc many more times than the 'sm' option.
I'm afraid I don't know the details of the algorithm, and so can't explain it in any deeper mathematical sense, but the page that I linked suggests ARPACK is best for matrices with some structure. Since matrices generated by rand have little structure, it is probably safe to assume the latter behaviour I described is not what you'd expect under normal operating conditions.
In short: it simply takes the algorithm more iterations to converge when you ask it for the smallest eigenvalues of a structured matrix. This being an iterative process, however, it very much depends on the actual data you give it.
Edit: There is a wealth of information and references about this method here, and the key to understanding exactly why this happens is surely contained somewhere therein.
The reason is actually much simpler and comes down to the basics of solving large sparse eigenvalue problems. These are all based on solving:
(1) A x = lam x
Most solution methods use some form of power iteration (e.g. the Krylov subspace built by both the Lanczos and Arnoldi methods).
The thing is that a power iteration converges to the largest eigenvalue of (1). Therefore the largest eigenvalues are found in the subspace spanned by K^k = {A*r0,....,A^k*r0}, which requires only matrix-vector multiplications (cheap).
To find the smallest, we have to reformulate (1) as follows:
(2) 1/lam x = A^(-1) x or A^(-1) x = invlam x
Now solving for the largest eigenvalue of (2) is equivalent to finding the smallest eigenvalue of (1). In this case the subspace is spanned by K^k = {A^(-1)*r0,....,A^(-k)*r0}, which requires solving several linear systems (expensive!).
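The difference in cost per step can be sketched with plain power iteration and inverse iteration on a tiny matrix. This is a simplified stand-in for what Lanczos/Arnoldi do, not the ARPACK algorithm itself; the 3x3 matrix is a made-up example with eigenvalues 3 - sqrt(3), 3 and 3 + sqrt(3).

```matlab
A = [4 1 0; 1 3 1; 0 1 2];       % small symmetric example
x = ones(3, 1);                  % start vector for the power iteration
y = ones(3, 1);                  % start vector for the inverse iteration
[L, U, P] = lu(A);               % factor once, reuse for every solve

for k = 1:100
    x = A * x;                   % power step: only a matrix-vector product
    x = x / norm(x);
    y = U \ (L \ (P * y));       % inverse step: solve A*z = y
    y = y / norm(y);
end

lam_max = x' * A * x;            % Rayleigh quotient, approaches the largest eigenvalue
lam_min = y' * A * y;            % Rayleigh quotient, approaches the smallest eigenvalue
fprintf('largest: %.6f, smallest: %.6f\n', lam_max, lam_min);
```

For a large sparse A, the factorisation and the triangular solves in the inverse step are what make the smallest eigenvalues so much more expensive.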
I use the function below to generate the betas for a given set of guess lambdas from my optimiser.
When running I often get the following warning message:
Warning: Matrix is singular to working precision.
In NSS_betas at 9
In DElambda at 19
In Individual_Lambdas at 36
I'd like to be able to exclude any betas that form a singular matrix from the solution set, but I don't know how to test for it.
I've been trying to use rcond() but I don't know where to make the cut off between singular and non singular?
Surely if Matlab is generating the warning message it already knows if the matrix is singular or not so if I could just find where that variable was stored I could use that?
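function betas=NSS_betas(lambda,data)
mats=data.mats2';    % maturities as a column vector
lambda=lambda;       % no-op, kept from the original
yM=data.y2';         % observed values as a column vector
nObs=size(yM,1);     % number of observations
G= [ones(nObs,1) (1-exp(-mats./lambda(1)))./(mats./lambda(1)) ((1-exp(-mats./lambda(1)))./(mats./lambda(1))-exp(-mats./lambda(1))) ((1-exp(-mats./lambda(2)))./(mats./lambda(2))-exp(-mats./lambda(2)))];
betas=G\yM;          % the backslash solve is what can raise the singularity warning
r=rcond(G);          % reciprocal condition estimate (computed but not returned)
end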
Thanks for the advice:
I tested all three examples below after setting the lambda values to be equal, thus giving a singular matrix
if (~isinf(G))
r=rank(G);
r2=rcond(G);
r3=min(svd(G));
end
r=3, r2 =2.602085213965190e-16; r3= 1.075949299504113e-15;
So in this test rank() and rcond() worked, assuming I take the benchmark values as given below.
However what happens when I have two values that are close but not exactly equal?
How can I decide what is too close?
rcond is the right way to go here. If it gets close to machine epsilon (about 2.2e-16 for doubles), your matrix is singular to working precision. I usually go with:
if( rcond(A) < 1e-12 )
% This matrix doesn't look good
end
You can experiment with a value that suits your needs, but taking the inverse of a matrix that is even close to singular with MATLAB can produce garbage results.
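To get a feel for where a cutoff might sit, here is a sketch that rebuilds G from the question for two lambdas that move closer together and prints rcond(G) each time. The maturities and lambda values are invented for illustration, and the 1e-12 cutoff is just the value suggested above, not a universal constant.

```matlab
mats   = [1 2 5 10]';                 % hypothetical maturities (not from the question)
lam1   = 2;                           % hypothetical lambda(1)
deltas = [1 1e-3 1e-6 1e-9 0];        % gap between lambda(1) and lambda(2)
r      = zeros(size(deltas));

for i = 1:numel(deltas)
    lam2 = lam1 + deltas(i);
    x1 = mats ./ lam1;
    x2 = mats ./ lam2;
    % Same four columns as G in NSS_betas
    G = [ones(4,1), (1-exp(-x1))./x1, (1-exp(-x1))./x1 - exp(-x1), (1-exp(-x2))./x2 - exp(-x2)];
    r(i) = rcond(G);
    fprintf('delta = %-8g rcond(G) = %g\n', deltas(i), r(i));
end

cutoff = 1e-12;                       % threshold from the answer; tune for your data
usable = r >= cutoff;                 % logical mask of lambda pairs worth keeping
```

Printing rcond over a range of gaps like this shows how quickly the conditioning degrades, which is usually a better guide than a fixed number picked in advance.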
You could compare the result of rank(G) with the number of columns of G. If the rank is less than the column dimension, you will have a singular matrix.
You can also check this with:
min(svd(A))>eps
This verifies that the smallest singular value is larger than eps, or any other numerical tolerance that is relevant to your needs (the expression returns 1 or 0).
Here's more info about it...
Condition number (Maximal singular value/Minimal singular value) is another good method:
cond(A)
It uses svd. It should be as close to 1 as possible. Very large values mean that the matrix is almost singular. Inf means that it is precisely singular.
Note that almost all of the methods mentioned in other answers use svd in some way.
There are special tools designed for this problem, appropriately called "rank revealing matrix factorizations". To the best of my (admittedly somewhat dated) knowledge, a good enough way to decide whether an n x n matrix A is nonsingular is to go with
det(A) <> 0 <=> rank(A) = n
and use a rank-revealing QR factorization of A:
AP = QR
where Q is orthogonal, P is a permutation matrix and R is an upper triangular matrix whose diagonal elements are non-increasing in magnitude along the diagonal.
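A minimal sketch of this, using the three-output form of qr, which performs column pivoting for full matrices. The relative cutoff of 1e-12 is an assumption for illustration, and column-pivoted QR is a practical heuristic for rank estimation rather than a guaranteed rank-revealing factorization in every case.

```matlab
A = magic(4);                     % rank 3 in exact arithmetic
[Q, R, P] = qr(A);                % column-pivoted QR: A*P = Q*R
d = abs(diag(R));                 % magnitudes along the diagonal of R
tol = 1e-12 * d(1);               % assumed relative cutoff, tune as needed
r = sum(d > tol);                 % estimated numerical rank

fprintf('abs(diag(R)) = %s\n', mat2str(d', 4));
fprintf('estimated rank = %d of %d\n', r, size(A, 2));
```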
MATLAB documentation of SVD states that the diagonal matrix returned has singular values in decreasing order. Is there a way to find out what the natural ordering of singular values would be?
The reason I ask is because the singular values correspond to dimensions associated with rows of the input matrix.
No, the very definition of the SVD does not introduce an ordering. Restricting the discussion to square matrices X and adopting the notation of the cited MATLAB documentation, if X = U*S*V' is an SVD of X, then for every permutation matrix P we can form a valid SVD as X = (U*P)*(P'*S*P)*(V*P)'. Presenting matrix S with descending values is just a matter of convenience: every permutation P'*S*P would do the same job.
As a side note: P*X = P*U*S*V' showing that a row permutation of matrix X does not change the singular values S, which can be considered independent from any row (or column) permutation of X.
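Both statements are easy to check numerically. In this sketch the 3x3 matrix and the permutation are arbitrary choices made for illustration:

```matlab
X = [2 0 1; 1 3 0; 0 1 4];                 % arbitrary example matrix
[U, S, V] = svd(X);
I3 = eye(3);
P  = I3(:, [2 3 1]);                       % an arbitrary permutation matrix

X2 = (U*P) * (P'*S*P) * (V*P)';            % reordered factorisation of the same X
fprintf('reconstruction error: %g\n', norm(X - X2));
fprintf('diag(S)       = %s\n', mat2str(diag(S)', 4));
fprintf('diag(P''*S*P)  = %s\n', mat2str(diag(P'*S*P)', 4));

% Permuting the rows of X leaves the singular values unchanged
fprintf('change in singular values after row permutation: %g\n', norm(svd(P*X) - svd(X)));
```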
I was hoping to get some idea of what is being asked here before responding. For example, the eigenshuffle tool I've posted on the file exchange allows you to reorder the eigenvalues and eigenvectors of a sequence of eigen-problems, so they are maximally consistent with each other in sequence. Perhaps your problem is similar, thus you might think of the singular values as functions that vary along with some parameter that drives a system.
But really, there is no natural ordering of the singular values that comes from the method used to compute the SVD. In fact, the only ordering that makes sense is the one that comes out - decreasing order. The order of the singular values is not dependent on the sequence of the rows of your matrix, as the question seems to vaguely imply, so I'm not sure what is meant there.
Feel free to modify the question in case you can make your needs clearer.
I'm running an optimization algorithm that requires calculation of the inverse of a matrix. The goal of the algorithm is to eliminate negative values from the matrix A and obtain the new matrix B. Basically, I start with known square matrices B and C of the same size.
I start by calculating the matrix A which is equal to:
A = B^-1 * C
Or in Matlab:
A = B\C;
I use this because Matlab told me B\C is more accurate than inv(B)*C.
The negative values in A are then divided by two and A is then normalised so that its rows have length 1. Using this new A, I calculate a new B with:
(1/N) * A * C' = B^-1
where N is just a scaling factor (# of columns in A). This new B would then be used again in the first step and these iterations continue until the negatives in A are gone.
My problem is I have to calculate B from the second equation and then normalise it.
invB = (1/N)*A*C';
B = inv(invB);
I've been calculating B using inv(B^-1) but after a few iterations I start getting messages that B^-1 is "close to singular or badly scaled."
This algorithm actually works for smaller matrices (around 70x70) but when it gets up to about 500x500 I start getting these messages.
Are there any better ways to calculate inv(B^-1)?
You should definitely heed warnings about singular matrices. Results in numerical linear algebra tend to break down as you move toward matrices with high condition numbers. The underlying idea is if
A*b_1 = c
and we're actually solving the problem (because we are using approximate numbers when we use computers)
(A + matrix error)*b_2 = (c + vector error)
how close are b_1 and b_2 as a function of the matrix and vector errors? When A has small condition number b_1 and b_2 are close. When A has large condition number b_1 and b_2 are not close.
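A small sketch of that sensitivity, with two made-up 2x2 matrices and the same tiny error added to the right-hand side (only the vector error is perturbed here, to keep it short):

```matlab
c_pert = [0; 1e-8];                     % small error added to the right-hand side
b_true = [1; 1];

A_good = [2 1; 1 3];                    % cond is about 2.6
A_bad  = [1 1; 1 (1 + 1e-10)];          % cond is about 4e10

c_good = A_good * b_true;
c_bad  = A_bad  * b_true;

% Distance between the solutions with and without the perturbation
d_good = norm(A_good \ (c_good + c_pert) - A_good \ c_good);
d_bad  = norm(A_bad  \ (c_bad  + c_pert) - A_bad  \ c_bad);

fprintf('well-conditioned: cond = %.3g, solution change = %.3g\n', cond(A_good), d_good);
fprintf('ill-conditioned:  cond = %.3g, solution change = %.3g\n', cond(A_bad),  d_bad);
```

The same 1e-8 error barely moves the first solution but changes the second by many orders of magnitude more.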
There is an informative piece of analysis you could do on your algorithm. At each iteration, after you've found B, use MATLAB to find its condition number. This is
cond(B)
You will likely see the number climb rapidly. This indicates that every time you iterate your algorithm, you should trust your result for B less and less.
Problems like this crop up all the time in numerical mathematics. If you'll be working with numerical algorithms frequently, you should take some time to familiarize yourself with the role of condition numbers in the field and with preconditioning techniques. My preferred text for this is "Numerical Linear Algebra" by Lloyd N. Trefethen and David Bau, but any text on numerical linear algebra should address some of these issues.
Best of luck,
Andrew
The main issue is that your matrix has a high condition number (i.e. really small rcond(B) in your case). This is due to the iterative structure in your algorithm, I guess. As you do each iteration your small singular values get smaller and smaller so your condition number grows exponentially. You should check preconditioning to avoid this kind of behavior.
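As a toy model of that growth (not your actual algorithm), suppose the smallest singular value of B shrinks by a factor of 10 per iteration while the largest stays at 1. The rotation angle and the factor of 10 are arbitrary assumptions; the point is only how cond(B) and rcond(B) respond.

```matlab
theta = 0.3;                                            % arbitrary rotation angle
Q = [cos(theta) -sin(theta); sin(theta) cos(theta)];    % fixed orthogonal matrix
c_hist = zeros(1, 10);

for k = 1:10
    B = Q * diag([1, 10^(-k)]) * Q';     % hypothetical: smallest singular value is 10^(-k)
    c_hist(k) = cond(B);                 % 2-norm condition number, about 10^k
    fprintf('iteration %2d: cond(B) = %.3g, rcond(B) = %.3g\n', k, c_hist(k), rcond(B));
end
```

Logging cond(B) like this inside your own loop would show how early the conditioning starts to degrade.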