Reusing symbolic factorization for sparse non-PSD matrix in matlab - matlab

I want to solve a large sparse matrix equation Ax=b in MATLAB many times, where A has the same sparsity pattern but different values each time. A is not positive definite but symmetric here.
The computation takes most of the time in my program, so I hope to accelerate it.
What is the best way to reuse the symbolic factorization in MATLAB?
I heard about several options, but not sure which is the best (or possible) way:
Eigen + mex
symbfact (not sure it is usable)
suitesparse
PARDISO
... Any other suggestions?
Any help is welcomed! Thank you! :)

I'm wondering the same thing. So far, I concluded:
https://au.mathworks.com/matlabcentral/answers/497464-symbolic-factorization-of-a-large-sparse-matrix
and I'm using
https://au.mathworks.com/matlabcentral/answers/497401-pardiso-reuse-symbolic-factorization
which is buggy, at least in the matlab interface:
Sometimes, the solution is inaccurate.
This could be due to zeros in the init iter that aren't accounted for in the pattern.
Also, I'm not sure exactly how does the symbolic factorization works and if it depends on the matrix values. If indeed it doesn't, then to resolve this, I can fill in eps in the Hessian pattern.
Sometimes, there are unexplained crashes.
I still feel (I'm monitoring it) that there are memory leaks even with factoring every time.
I'll try to update if I find something.

Related

Dot product with huge vectors

I am facing the following problem: I have a system of 160000 linear equations with 160000 variables. I am going to write two programs on conjugate gradient method and steepest descent method to solve it. The matrix is block tridiagonal with only 5 non zero diagonals, thus it's not necessary to create and store the matrix. But I am having the following problem: when I go to the iterarion stepe, there must be dot product of vectors involved. I have tried the following commands: dot(u,v), u'*v, which are commonly used. But when I run the program, MATLAB told me the data size is too large for the memory.
To resolve this problem, I tried to decompose the huge vector into sparse vectors with small support, then calculate the dot products of small vectors and finally glue them together. But it seems that this method is more complicated and not very efficient, and it is easy (especially for beginners like me) to make mistakes. I wonder if there're any more efficient ways to deal with this problem. Thanks in advance.

Matlab division of large matrices [duplicate]

I have this problem which requires solving for X in AX=B. A is of the order 15000 x 15000 and is sparse and symmetric. B is 15000 X 7500 and is NOT sparse. What is the fastest way to solve for X?
I can think of 2 ways.
Simplest possible way, X = A\B
Using for loop,
invA = A\speye(size(A))
for i = 1:size(B,2)
X(:,i) = invA*B(:,i);
end
Is there a better way than the above two? If not, which one is best between the two I mentioned?
First things first - never, ever compute inverse of A. That is never sparse except when A is a diagonal matrix. Try it for a simple tridiagonal matrix. That line on its own kills your code - memory-wise and performance-wise. And computing the inverse is numerically less accurate than other methods.
Generally, \ should work for you fine. MATLAB does recognize that your matrix is sparse and executes sparse factorization. If you give a matrix B as the right-hand side, the performance is much better than if you only solve one system of equations with a b vector. So you do that correctly. The only other technical thing you could try here is to explicitly call lu, chol, or ldl, depending on the matrix you have, and perform backward/forward substitution yourself. Maybe you save some time there.
The fact is that the methods to solve linear systems of equations, especially sparse systems, strongly depend on the problem. But in almost any (sparse) case I imagine, factorization of a 15k system should only take a fraction of a second. That is not a large system nowadays. If your code is slow, this probably means that your factor is not that sparse sparse anymore. You need to make sure that your matrix is properly reordered to minimize the fill (added non-zero entries) during sparse factorization. That is the crucial step. Have a look at this page for some tests and explanations on how to reorder your system. And have a brief look at example reorderings at this SO thread.
Since you can answer yourself which of the two is faster, I'll try yo suggest the next options.
Solve it using a GPU. Plenty of details can be found online, including this SO post, a matlab benchmarking of A/b, etc.
Additionally, there's the MATLAB add-on of LAMG (Lean Algebraic Multigrid). LAMG is a fast graph Laplacian solver. It can solve Ax=b in O(m) time and storage.
If your matrix A is symmetric positive definite, then here's what you can do to solve the system efficiently and stably:
First, compute the cholesky decomposition, A=L*L'. Since you have a sparse matrix, and you want to exploit it to accelerate the inversion, you should not apply chol directly, which would destroy the sparsity pattern. Instead, use one of the reordering method described here.
Then, solve the system by X = L'\(L\B)
Finally, if are not dealing with potential complex values, then you can replace all the L' by L.', which gives a bit further acceleration because it's just trying to transpose instead of computing the complex conjugate.
Another alternative would be the preconditioned conjugate gradient method, pcg in Matlab. This one is very popular in practice, because you can trade off speed for accuracy, i.e. give it less number of iterations, and it will give you a (usually pretty good) approximate solution. You also never need to store the matrix A explicitly, but just be able to compute matrix-vector product with A, if your matrix doesn't fit into memory.
If this takes forever to solve in your tests, you are probably going into virtual memory for the solve. A 15k square (full) matrix will require 1.8 gigabytes of RAM to store in memory.
>> 15000^2*8
ans =
1.8e+09
You will need some serious RAM to solve this, as well as the 64 bit version of MATLAB. NO factorization will help you unless you have enough RAM to solve the problem.
If your matrix is truly sparse, then are you using MATLAB's sparse form to store it? If not, then MATLAB does NOT know the matrix is sparse, and does not use a sparse factorization.
How sparse is A? Many people think that a matrix that is half full of zeros is "sparse". That would be a waste of time. On a matrix that size, you need something that is well over 99% zeros to truly gain from a sparse factorization of the matrix. This is because of fill-in. The resulting factorized matrix is almost always nearly full otherwise.
If you CANNOT get more RAM (RAM is cheeeeeeeeep you know, certainly once you consider the time you have wasted trying to solve this) then you will need to try an iterative solver. Since these tools do not factorize your matrix, if it is truly sparse, then they will not go into virtual memory. This is a HUGE savings.
Since iterative tools often require a preconditioner to work as well as possible, it can take some study to find the best preconditioner.

Efficient way to solve for X in AX=B in MATLAB when both A and B are big matrices

I have this problem which requires solving for X in AX=B. A is of the order 15000 x 15000 and is sparse and symmetric. B is 15000 X 7500 and is NOT sparse. What is the fastest way to solve for X?
I can think of 2 ways.
Simplest possible way, X = A\B
Using for loop,
invA = A\speye(size(A))
for i = 1:size(B,2)
X(:,i) = invA*B(:,i);
end
Is there a better way than the above two? If not, which one is best between the two I mentioned?
First things first - never, ever compute inverse of A. That is never sparse except when A is a diagonal matrix. Try it for a simple tridiagonal matrix. That line on its own kills your code - memory-wise and performance-wise. And computing the inverse is numerically less accurate than other methods.
Generally, \ should work for you fine. MATLAB does recognize that your matrix is sparse and executes sparse factorization. If you give a matrix B as the right-hand side, the performance is much better than if you only solve one system of equations with a b vector. So you do that correctly. The only other technical thing you could try here is to explicitly call lu, chol, or ldl, depending on the matrix you have, and perform backward/forward substitution yourself. Maybe you save some time there.
The fact is that the methods to solve linear systems of equations, especially sparse systems, strongly depend on the problem. But in almost any (sparse) case I imagine, factorization of a 15k system should only take a fraction of a second. That is not a large system nowadays. If your code is slow, this probably means that your factor is not that sparse sparse anymore. You need to make sure that your matrix is properly reordered to minimize the fill (added non-zero entries) during sparse factorization. That is the crucial step. Have a look at this page for some tests and explanations on how to reorder your system. And have a brief look at example reorderings at this SO thread.
Since you can answer yourself which of the two is faster, I'll try yo suggest the next options.
Solve it using a GPU. Plenty of details can be found online, including this SO post, a matlab benchmarking of A/b, etc.
Additionally, there's the MATLAB add-on of LAMG (Lean Algebraic Multigrid). LAMG is a fast graph Laplacian solver. It can solve Ax=b in O(m) time and storage.
If your matrix A is symmetric positive definite, then here's what you can do to solve the system efficiently and stably:
First, compute the cholesky decomposition, A=L*L'. Since you have a sparse matrix, and you want to exploit it to accelerate the inversion, you should not apply chol directly, which would destroy the sparsity pattern. Instead, use one of the reordering method described here.
Then, solve the system by X = L'\(L\B)
Finally, if are not dealing with potential complex values, then you can replace all the L' by L.', which gives a bit further acceleration because it's just trying to transpose instead of computing the complex conjugate.
Another alternative would be the preconditioned conjugate gradient method, pcg in Matlab. This one is very popular in practice, because you can trade off speed for accuracy, i.e. give it less number of iterations, and it will give you a (usually pretty good) approximate solution. You also never need to store the matrix A explicitly, but just be able to compute matrix-vector product with A, if your matrix doesn't fit into memory.
If this takes forever to solve in your tests, you are probably going into virtual memory for the solve. A 15k square (full) matrix will require 1.8 gigabytes of RAM to store in memory.
>> 15000^2*8
ans =
1.8e+09
You will need some serious RAM to solve this, as well as the 64 bit version of MATLAB. NO factorization will help you unless you have enough RAM to solve the problem.
If your matrix is truly sparse, then are you using MATLAB's sparse form to store it? If not, then MATLAB does NOT know the matrix is sparse, and does not use a sparse factorization.
How sparse is A? Many people think that a matrix that is half full of zeros is "sparse". That would be a waste of time. On a matrix that size, you need something that is well over 99% zeros to truly gain from a sparse factorization of the matrix. This is because of fill-in. The resulting factorized matrix is almost always nearly full otherwise.
If you CANNOT get more RAM (RAM is cheeeeeeeeep you know, certainly once you consider the time you have wasted trying to solve this) then you will need to try an iterative solver. Since these tools do not factorize your matrix, if it is truly sparse, then they will not go into virtual memory. This is a HUGE savings.
Since iterative tools often require a preconditioner to work as well as possible, it can take some study to find the best preconditioner.

Optimization stops prematurely (MATLAB)

I'm trying my best to work it out with fmincon in MATLAB. When I call the function, I get one of the two following errors:
Number of function evaluation exceeded, or
Number of iteration exceeded.
And when I look at the solution so far, it is way off the one intended (I know so because I created a minimum vector).
Now even if I increase any of the tolerance constraint or max number of iterations, I still get the same problem.
Any help is appreciated.
First, if your problem can actually be cast as linear or quadratic programming, do that first.
Otherwise, have you tried seeding it with different starting values x0? If it's starting in a bad place, it may be much harder to get to the optimum.
If it's possible for you to provide the gradient of the function, that can help the optimizer tremendously (though obviously only if you can find it some way other than numerical differentiation). Similarly, if you can provide the (full or sparse) Hessian relatively cheaply, you're golden.
You can also try using a different algorithm in the solver.
Basically, fmincon by default has almost no info about the function it's trying to optimize, and providing more can be extremely helpful. If you can tell us more about the objective function, we might be able to give more tips.
The L1 norm is not differentiable. That can make it difficult for the algorithm to converge to a point where one of the residuals is zero. I suspect this is why number of iterations limits are exceeded. If your original problem is
min norm(residual(x),1)
s.t. Aeq*x=beq
you can reformulate the problem differentiably, as follows
min sum(b)
s.t. -b(i)<=residual(x,i)<=b(i)
Aeq*x=beq
where residual(x,i) is the i-th residual, x is the original vector of unknowns, and b is a further unknown vector of bounds that you add to the problem.

MATLAB calculates INV wrong (for singular matrices)

MATLAB calculate INV wrong sometimes:
See this example
[ 8617412867597445*2^(-25), 5859840749966268*2^(-28)]
[ 5859840749966268*2^(-28), 7969383419954132*2^(-32)]
If you put this in MATLAB it doesn't have inverse but in s calculator it has one.
What is going on?
Please read What every scientist should know about floating point arithmetic
Next, don't compute the inverse anyway. An inverse matrix is almost never necessary, except in textbooks, where it is convenient to write. Sadly, many authors do not appreciate this fact anyway, because they had learned from textbooks by other people who also failed to understand that an inverse matrix is a bad thing to do in general.
Since this matrix is numerically singular in double precision arithmetic, the inverse of that matrix is meaningless.
Use of the matlab backslash operator will be better and faster in general than will inverse. Or use pinv, which will be more robust to problems.
Hi I wanted to comment on Woodchips' answer but since I'm a new user I can't seem to do that, that is one very interesting article and I must read it in more detail when I have the time...
With regards to matrix inversion, you could always use the 'cond' command to calculate the condition number of the matrix, for a non-singular matrix the value should be approaching unity. As Woodchips suggested, 'pinv' does come in handy if you need to find the psuedo-inverse of a non-square matrix.