Matlab Zero Tolerance in rank function - matlab

I am wondering whether there is a technical or theoretical reason why MATLAB's rank function treats singular values below max(size(A))*eps(norm(A)) as zero. Can you please provide some intuition?
Thank you!

The following answer is not based on proper mathematical reasoning; it is just speculation (as you were asking for intuition):
norm(A), the largest singular value, is a measure of the overall magnitude of the matrix.
eps(norm(A)) is thus the accuracy that the floating-point representation of numbers of that magnitude typically has.
Now, suppose you add N numbers that should theoretically sum to zero, but each of them carries an error of about eps. If the errors are independent, I think we would expect an error on the order of sqrt(N) * eps in the result.
Then, if the algorithm that computes the rank performs N^2 operations on the matrix entries (where N is its size) to produce a number that is checked against zero, the error we would expect is sqrt(N^2) * eps = N * eps, which is the tolerance you stated in your question.
What I don't know is whether the algorithm that MATLAB uses really has complexity N^2.
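To make the tolerance from the question concrete, here is a minimal Python sketch (standard library only, Python 3.9+). It assumes the singular values are already known and uses math.ulp(x) in the role of eps(x); the function names are made up for illustration and are not MATLAB's implementation.

```python
import math

def rank_tolerance(n_rows, n_cols, largest_singular_value):
    # math.ulp(x) is the gap between x and the next larger float,
    # which is what eps(x) returns for positive x.
    return max(n_rows, n_cols) * math.ulp(largest_singular_value)

def count_above_tolerance(singular_values, n_rows, n_cols):
    # Numerical rank: how many singular values exceed the tolerance.
    tol = rank_tolerance(n_rows, n_cols, max(singular_values))
    return sum(1 for s in singular_values if s > tol)

# Hypothetical singular values of a 3x3 matrix; the last one is far
# below the tolerance and is treated as zero.
print(count_above_tolerance([1.0, 1e-3, 1e-17], 3, 3))
```

The tolerance scales with both the matrix size and the magnitude of the largest singular value, so multiplying the whole matrix by a constant does not change which singular values count as zero.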

Related

Rank of matrix contradicts the number of independent columns

I have a 50x49 matrix A that has 49 linearly independent columns. However, my software (Octave) tells me its rank is 44.
Is it due to some computational error? If so, then how to prevent such errors?
If the software was able to correctly calculate rref(A), then why did it fail with rank(A)? Does it mean that calculating rank(A) is more error prone than calculating rref(A), or vice versa? I mean rref(A) actually tells you the rank, but here's a contradiction.
P.S. I've checked, Python makes the same error.
EDIT 1: Here is the matrix A itself. The first 9 columns were given. The rest was obtained with polynomial features.
EDIT 2: I was able to find a similar issue. Here is a 10x10 matrix B of rank 10 (and Octave calculates its rank correctly). However, Octave says that rank(B * B) = 9, which is impossible.
The distinction between an invertible matrix (i.e. full rank) and a non-invertible one is clear-cut in theory, but not so in practice. A matrix B with large condition number (as in your example) can be inverted, but computing the inverse is numerically unstable. It roughly corresponds to B having a determinant that is "small" (using an appropriate, relative measure of "small"), so the matrix is almost singular. As a result, the inverse matrix will be computed with bad accuracy. In your example B, the condition number (computed with cond) is 2.069e9.
Another way to look at this is: when the condition number is large, it well could be that B is "really" singular, but small numerical errors from previous computations make it look barely non-singular. So you can't be sure.
The rank and rref functions use different algorithms (singular-value decomposition for rank, Gauss-Jordan elimination with partial pivoting for rref). For well-behaved matrices the numerical errors will be small in both cases, and the results will be consistent. But for an ill-conditioned matrix the numerical errors will be large and potentially different in each case, giving inconsistent results.
This is a well-known issue in numerical linear algebra. In general, avoid inverting matrices with a large condition number.
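A tiny illustration of how rounding alone can change the computed rank, as a sketch in plain Python (3.9+). It uses the closed-form singular values of a 2x2 matrix and a tolerance of max(size) * ulp(largest singular value); the helper names are invented for this example and are not what Octave runs internally.

```python
import math

def singular_values_2x2(a, b, c, d):
    """Singular values of [[a, b], [c, d]], largest first."""
    frob2 = a * a + b * b + c * c + d * d      # squared Frobenius norm
    det = a * d - b * c
    disc = math.sqrt(max(frob2 * frob2 - 4.0 * det * det, 0.0))
    s_max = math.sqrt((frob2 + disc) / 2.0)
    # s_max * s_min = |det|; this avoids cancellation for the small value.
    s_min = abs(det) / s_max if s_max > 0.0 else 0.0
    return s_max, s_min

def numerical_rank_2x2(a, b, c, d):
    s_max, s_min = singular_values_2x2(a, b, c, d)
    tol = 2 * math.ulp(s_max)                  # max(size) is 2 here
    return sum(1 for s in (s_max, s_min) if s > tol)

# Exactly, both matrices have rank 2. In double precision 1 + 1e-17
# rounds to 1.0, so the second one is stored as a singular matrix.
print(numerical_rank_2x2(1.0, 1.0, 1.0, 1.0 + 1e-3))
print(numerical_rank_2x2(1.0, 1.0, 1.0, 1.0 + 1e-17))
```

The same thing happens on a larger scale when a product such as B * B pushes the smallest singular value below the tolerance.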

Choosing a suitable plaintext_modulus

In choosing parameters such as plaintext_modulus, is there any good strategy? (aside from guess-and-check until the output looks correct)
In particular, I'm experimenting with IntegerEncoder with BFV. My (potentially-wrong) understanding is that the plaintext_modulus is not the modulus for the integer being encoded, but the modulus for each coefficient in the polynomial representation.
With B=2, it looks like these coefficients will just be 0 or 1. However, after operations like add and multiply are applied, this clearly is no longer the case. Is there a good way to determine a good bound for the coefficients, in order to pick plaintext_modulus?
My (potentially-wrong) understanding is that the plaintext_modulus is not the modulus for the integer being encoded, but the modulus for each coefficient in the polynomial representation.
This is the correct way of thinking when using IntegerEncoder. Note, however, that when using BatchEncoder (PolyCRTBuilder in SEAL 2.*) the situation is exactly the opposite: each slot in the plaintext vector is an integer modulo plain_modulus.
With B=2, it looks like these coefficients will just be 0 or 1. However, after operations like add and multiply are applied, this clearly is no longer the case. Is there a good way to determine a good bound for the coefficients, in order to pick plaintext_modulus?
The whole point of IntegerEncoder is that fresh encodings have as small coefficients as possible, delaying plain_modulus overflow and allowing you to use smaller plain_modulus (implies smaller noise growth). SEAL 2.* had an automatic parameter selection tool that performed heuristic upper bound estimates on noise growth and plaintext coefficient growth, and basically did exactly what you want. Unfortunately these estimates were performed on a per-operation basis, causing overestimates in the earlier operations to blow up in later stages of the computation. As a result, the estimates were not very tight for more than the simplest computations and in many cases the parameters this tool provided were oversized.
To estimate the plaintext coefficient growth in multiplications, let's consider two polynomials p(x) and q(x). Obviously the product will have degree exactly equal to deg(p)+deg(q); that part is easy. If |P| denotes the infinity norm of a polynomial P (absolute value of largest coefficient), then:
|p*q| <= min{deg(p)+1, deg(q)+1} * |p||q|.
Actually, SEAL 2.* is a little bit more precise here. Instead of using the degrees, it uses the number of non-zero coefficients in these polynomials. This makes a big difference when the polynomials are sparse, in which case the contribution from cross-terms is much smaller and a better bound is:
|p*q| <= min{#(non_zero_coeffs(p)), #(non_zero_coeffs(q))} * |p||q|.
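The two bounds can be checked on small examples with a short Python sketch. It works on plain integer polynomials stored as coefficient lists (lowest degree first, no trailing zeros) and ignores any reduction modulo the polynomial modulus or plain_modulus; the function names are illustrative and not part of SEAL.

```python
def poly_mul(p, q):
    """Schoolbook product of two coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def inf_norm(p):
    return max(abs(c) for c in p)

def nonzeros(p):
    return sum(1 for c in p if c != 0)

def degree_bound(p, q):
    # len(p) equals deg(p) + 1 when there are no trailing zeros.
    return min(len(p), len(q)) * inf_norm(p) * inf_norm(q)

def sparse_bound(p, q):
    return min(nonzeros(p), nonzeros(q)) * inf_norm(p) * inf_norm(q)

dense = [1, 1, 1]            # 1 + x + x^2
sparse = [1, 0, 0, 0, 1]     # 1 + x^4
print(inf_norm(poly_mul(dense, dense)), degree_bound(dense, dense))
print(inf_norm(poly_mul(sparse, dense)),
      degree_bound(sparse, dense), sparse_bound(sparse, dense))
```

For the dense pair the degree bound is attained exactly, while for the sparse pair the non-zero-count bound is tighter than the degree bound.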
A deeper analysis of coefficient growth in IntegerEncoder-like encoders is done in https://eprint.iacr.org/2016/250 by Costache et al., which you may want to look at.

how to convert a matrix to a diagonally dominant matrix using pivoting in Matlab

Hi, I am trying to solve a linear system of the following type:
A*x=b,
where A is the coefficient matrix,
x is the vector of unknowns and
b is the right-hand-side vector.
The coefficient matrix (A) is an n-by-n sparse matrix that even has zeros on the diagonal. In order to solve this system accurately, I am using an iterative method in MATLAB called bicgstab (biconjugate gradients stabilized method).
This coefficient matrix (A) has a
det(A)=-4.1548e-05 and a rcond(A)= 1.1331e-04.
Therefore the matrix is ill-conditioned. I first tried scaling, and the results were:
det(A)= -1.2612e+135 but the rcond(A)=5.0808e-07...
Therefore the matrix is still ill-conditioned... I checked, and the sum of the absolute values of all off-diagonal elements was 163.60, while the sum of the absolute values of all diagonal elements was 32.49... Therefore the coefficient matrix is not diagonally dominant and will not converge using my function bicgstab...
I am looking for someone who can help me perform pivoting on the coefficient matrix (A) so that it becomes diagonally dominant. Or any advice to solve this problem....
Thanks for the help.
First, quite a few things should be noted here:
Don't use the determinant to estimate the "amount of singularity" of your matrix. The determinant is the product of all the eigenvalues of your matrix, and therefore its scaling can be wildly misleading compared to a much better measure like the condition number, leading to the next point:
Your conditioning (according to rcond) isn't that bad. Are you working with single or double precision? Large problems can routinely get condition numbers in this range and still be quite solvable, but of course this depends on a very complicated interaction of many factors, of which the condition number plays only a small part. This leads to another complicated point:
Diagonal dominance may not help you at all here. BiCGStab as far as I know does not require diagonal dominance for its convergence, and also I don't think diagonal dominance is known even to help it. Diagonal dominance is usually an assumption made by other iterative methods such as the Jacobi method or Gauss-Seidel. Actually the convergence behavior of BiCGStab is not very well understood at all, and it is usually only used when memory is a very severe problem but conjugate gradients is not applicable.
If you are really interested in using a Krylov method (such as BiCGStab) to solve your problem, then you generally need to have more understanding of where your matrix is coming from so that you can choose a sensible preconditioner.
So this calls for a bit more information. Do you know more about this matrix? Is it arising from some kind of physical problem? Do you know, for example, if it is symmetric or positive definite (I will assume not both, because you are not using CG)?
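The point about the determinant being a misleading measure can be seen with diagonal matrices, where both the determinant and the 2-norm condition number are trivial to compute. This is only a sketch in plain Python (3.8+) with made-up helper names; it says nothing about your actual matrix.

```python
import math

def det_diagonal(d):
    # Determinant of diag(d): the product of the diagonal entries.
    return math.prod(d)

def cond_diagonal(d):
    # 2-norm condition number of diag(d): largest over smallest magnitude.
    mags = [abs(v) for v in d]
    return max(mags) / min(mags)

well_conditioned = [0.1] * 50     # 0.1 * identity, 50x50
badly_conditioned = [1.0, 1e-8]   # 2x2 with widely spread entries

print(det_diagonal(well_conditioned), cond_diagonal(well_conditioned))
print(det_diagonal(badly_conditioned), cond_diagonal(badly_conditioned))
```

The first matrix has a determinant around 1e-50 but is perfectly conditioned; the second has a much larger determinant and is far worse conditioned.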
Let me end with some actionable advice which is very generic, and so not necessarily optimal:
If memory is not an issue, consider using restarted GMRES instead of BiCGStab. My experience is that GMRES has much more robust convergence.
Try an approximate factorization preconditioner such as ILU. MATLAB has a function for this built in.

Why eigs( 'lm') is much faster than eigs('sm')

I use eigs to calculate the eigenvectors of large sparse square matrices (dimension in the tens of thousands).
What I want are the eigenvectors belonging to the smallest eigenvalues.
But
eigs(A, 10, 'sm') % Note: A is the matrix
runs very slowly.
However, using eigs(A, 10, 'lm') gives me the answer relatively quickly.
As I found out, replacing 10 with A_width in eigs(A, 10, 'lm'), so that it returns all the eigenvectors, doesn't solve the problem, because it becomes as slow as using 'sm'.
So I want to know why calculating the smallest eigenvectors (using 'sm') is much slower than calculating the largest.
BTW, if you have any idea how to make eigs with 'sm' as fast as with 'lm', please tell me.
The algorithm used in pretty much any standard eigs function is (some variation of) the Lanczos algorithm. It is iterative and the first iterations give you the largest eigenvalues. This explains pretty much every observation you make:
Largest eigenvalues take the least amount of iterations,
Smallest eigenvalues take the maximum amount of iterations,
All eigenvalues also take the maximum amount of iterations.
There are tricks to "fool" eigs into calculating the smallest eigenvalues by actually making them the largest eigenvalues of another problem. This is usually accomplished by a shift parameter. Skimming over the Matlab documentation for eigs, I see that they have a sigma parameter, which might help you. Note the same documentation recommends proper eig if the matrix fits into memory, as eigs has its numerical quirks.
Since eigs is actually an m-file function, we can profile it. I have run a couple of basic tests, and it depends very much on the nature of the data in the matrix. If we run the profiler separately on the following two lines of code:
eigs(eye(1000), 10, 'lm'), and
eigs(eye(1000), 10, 'sm'),
then in the first instance it calls arpackc (the main function that does the work - according to the comments in eigs it's probably from here) a total of 22 times. In the second instance it is called 103 times.
On the other hand, trying it with
eigs(rand(1000), 10, 'lm'), and
eigs(rand(1000), 10, 'sm'),
I get results where the 'lm' option consistently calls arpackc many more times than the sm option.
I'm afraid I don't know the details of the algorithm, and so can't explain it in any deeper mathematical sense, but the page that I linked suggests ARPACK is best for matrices with some structure. Since matrices generated by rand have little structure, it is probably safe to assume the latter behaviour I described is not what you'd expect under normal operating conditions.
In short: it simply takes the algorithm more iterations to converge when you ask it for the smallest eigenvalues of a structured matrix. This being an iterative process, however, it very much depends on the actual data you give it.
Edit: There is a wealth of information and references about this method here, and the key to understanding exactly why this happens is surely contained somewhere therein.
The reason is actually much simpler and comes down to the basics of solving large sparse eigenvalue problems. These are all based on solving:
(1) A x = lam x
Most solution methods use some form of power iteration (e.g. the Krylov subspace spanned in both the Lanczos and Arnoldi methods).
The thing is that power iteration converges to the largest eigenvalue of (1). Therefore the largest eigenvalues are found in the subspace spanned by K^k = {A*r0,....,A^k*r0}, which requires only matrix-vector multiplications (cheap).
To find the smallest, we have to reformulate (1) as follows:
(2) 1/lam x = A^(-1) x or A^(-1) x = invlam x, with invlam = 1/lam
Now solving for the largest eigenvalue of (2) is equivalent to finding the smallest eigenvalue of (1). In this case the subspace is spanned by K^k = {A^(-1)*r0,....,A^(-k)*r0}, which requires solving several linear systems (expensive!).
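The difference in cost per step shows up even in the simplest versions of the two ideas. The sketch below, in plain Python, uses basic power iteration and inverse iteration on a small dense matrix; it is not ARPACK or what eigs does internally, and the helper names and the starting vector are arbitrary choices for illustration.

```python
import math

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def normalize(x):
    n = math.sqrt(sum(v * v for v in x))
    return [v / n for v in x]

def solve(A, b):
    """Gaussian elimination with partial pivoting (dense, for small A)."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def rayleigh(A, x):
    Ax = matvec(A, x)
    return sum(a * b for a, b in zip(Ax, x)) / sum(v * v for v in x)

def largest_eig(A, iters=200):
    x = normalize([1.0] + [0.5] * (len(A) - 1))
    for _ in range(iters):
        x = normalize(matvec(A, x))   # one multiplication per step (cheap)
    return rayleigh(A, x)

def smallest_eig(A, iters=200):
    x = normalize([1.0] + [0.5] * (len(A) - 1))
    for _ in range(iters):
        x = normalize(solve(A, x))    # one linear solve per step (expensive)
    return rayleigh(A, x)

A = [[2.0, 1.0], [1.0, 2.0]]          # eigenvalues 3 and 1
print(largest_eig(A), smallest_eig(A))
```

In practice the matrix would be factorized once and the factors reused for every solve, but each step still costs more than a sparse matrix-vector product.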

Determinants of huge matrices in MATLAB

For a simulation problem, I want to calculate determinants of complex square matrices on the order of 1000x1000 in MATLAB. Since the values come from Bessel functions, the matrices are not at all sparse.
Since I am interested in the change of the determinant with respect to some parameter (the energy of a searched eigenfunction in my case), I overcome the problem at the moment by first searching a rescaling factor for the studied range and then calculate the determinants,
result(k) = det(pre_factor*Matrix{k});
Now this is a very awkward solution and only works for matrix dimensions of, say, maximum 500x500.
Does anybody know a nice solution to the problem? Interfacing to Mathematica might work in principle but I have my doubts concerning feasibility.
Thank you in advance
Robert
Edit: I did not find a convenient solution to the calculation problem, since this would require changing to a higher precision. Instead, I used that
ln det M = trace ln M
which, when I differentiate it with respect to k, gives
A = trace(inv(M(k))*dM/dk)
So I at least had the change of the logarithm of the determinant with respect to k. From the physical background of the problem I could derive constraints on A which in the end gave me a workaround valid for my problem. Unfortunately I do not know if such a workaround could be generalized.
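The derivative identity used in the edit can be sanity-checked numerically on a toy example. The following Python sketch compares trace(inv(M)*dM/dk) with a central finite difference of ln|det M|; the 2x2 real matrix M(k) is made up purely for the check and has nothing to do with the Bessel-function matrices of the question.

```python
import math

def M(k):
    # Hypothetical parameter-dependent matrix.
    return [[2.0 + k, 1.0], [k, 3.0]]

def dM_dk(k):
    # Entry-wise derivative of M with respect to k.
    return [[1.0, 0.0], [1.0, 0.0]]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def trace_of_product(a, b):
    # trace(a * b) without forming the product explicitly.
    return sum(a[i][j] * b[j][i] for i in range(2) for j in range(2))

def dlogdet_formula(k):
    return trace_of_product(inv2(M(k)), dM_dk(k))

def dlogdet_finite_difference(k, h=1e-6):
    f = lambda t: math.log(abs(det2(M(t))))
    return (f(k + h) - f(k - h)) / (2.0 * h)

print(dlogdet_formula(1.0), dlogdet_finite_difference(1.0))
```

For this M(k) the determinant is 6 + 2k, so both values should be close to 2 / (6 + 2k).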
You should realize that when you multiply a matrix by a constant k, then you scale the determinant of the matrix by k^n, where n is the dimension of the matrix. So for n = 1000, and k = 2, you scale the determinant by
>> 2^1000
ans =
1.07150860718627e+301
This is of course a huge number, so you might expect that it should fail, since in double precision, MATLAB will only represent floating point numbers as large as realmax.
>> realmax
ans =
1.79769313486232e+308
There is no need to do all the work of recomputing that determinant, not that computing the determinant of a huge matrix like that is a terribly well-posed problem anyway.
If speed is not a concern, you may want to use det(e^A) = e^(tr A) and take as A some scaling constant times your matrix (so that A - I has spectral radius less than one).
EDIT: In MATLAB, the log of a matrix (logm) is calculated via trigonalization. So it is better to compute the eigenvalues of your matrix and multiply them (or better, add their logarithms). You did not specify whether your matrix is symmetric or not: if it is, finding the eigenvalues is easier than if it is not.
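The "add their logarithms" idea also works with LU pivots instead of eigenvalues, since the determinant is, up to sign, the product of the pivots. Below is a minimal Python sketch of that variant for real matrices (the question's matrices are complex, so this is a simplification); the function name is hypothetical and the code is a plain dense elimination, not tuned for 1000x1000 inputs.

```python
import math

def sign_and_logabsdet(A):
    """Return (sign, log|det A|) via elimination with partial pivoting."""
    n = len(A)
    M = [list(map(float, row)) for row in A]
    sign, logabs = 1.0, 0.0
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        if M[piv][col] == 0.0:
            return 0.0, float("-inf")     # exactly singular
        if piv != col:
            M[col], M[piv] = M[piv], M[col]
            sign = -sign                  # a row swap flips the sign
        pivot = M[col][col]
        if pivot < 0.0:
            sign = -sign
        logabs += math.log(abs(pivot))    # accumulate logs, not products
        for r in range(col + 1, n):
            f = M[r][col] / pivot
            for c in range(col + 1, n):
                M[r][c] -= f * M[col][c]
    return sign, logabs

# 1e-10 times the 40x40 identity: the determinant is 1e-400, which
# underflows to 0.0 in double precision, but its logarithm is fine.
n = 40
tiny = [[1e-10 if i == j else 0.0 for j in range(n)] for i in range(n)]
print(sign_and_logabsdet(tiny))
```

Working with the sign and the logarithm separately avoids both overflow and underflow, so no rescaling factor has to be searched for.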
You said the current value of the determinant is about 10^-300.
Are you trying to get the determinant at a certain value, say 1? If so, rescaling is awkward: the matrix you are considering is ill-conditioned, and, considering the precision of the machine, you should consider the output determinant to be zero. It is impossible to get a reliable inverse in other words.
I would suggest modifying the columns or rows of the matrix rather than rescaling it.
I used R to make a small test with a random matrix (random normal values), it seems the determinant should be clearly non-zero.
> n=100
> M=matrix(rnorm(n**2),n,n)
> det(M)
[1] -1.977380e+77
> kappa(M)
[1] 2318.188
This is not strictly a MATLAB solution, but you might want to consider using Mahout. It's specifically designed for large-scale linear algebra. (1000x1000 is no problem for the scales it's used to.)
You would call into Java to pass data to/from Mahout.