MATLAB sparse matrix solvers? Memory errors

In the context of a finite element problem, I have a 12800x12800 sparse matrix. I'm trying to solve the linear system using MATLAB's \ operator, and I get an out-of-memory error from mldivide. So I'm just wondering if there's a way to get around this.
I mean, will something like LU factorization actually help in terms of no longer getting the memory error? I increased the heap size to 256 GB in Preferences, which is the max I can set, and I still get the out-of-memory error.
Also, a more general question: I have 8 GB of RAM on my laptop right now. Will upgrading to 16 GB help at all? Or is there something I can do to allocate more memory to MATLAB? I'm pretty unfamiliar with this stuff.

According to this and this, you have some options to avoid the out-of-memory problem in MATLAB:
Increase the operating system's virtual memory
Give higher priority to the MATLAB process in Task Manager
Use the 64-bit version of MATLAB
A few months ago I was working on integer programming in MATLAB and ran into the "out of memory" problem, so I used sparse matrices and followed the tips above, and the problem was solved!

Are you locked in to using mldivide? This sounds like the perfect situation for an iterative method such as bicg or gmres; see the sketch below.
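A minimal sketch of that approach, assuming A and b are your assembled sparse system (the drop tolerance is an arbitrary starting point you would tune):

```matlab
% Restarted GMRES with an incomplete-LU preconditioner: the ILU factors stay
% sparse, so memory use is far below that of a full LU factorization.
setup.type    = 'ilutp';    % ILU with threshold dropping and pivoting
setup.droptol = 1e-5;       % drop tolerance; tune for your problem
[L, U] = ilu(A, setup);
x = gmres(A, b, 30, 1e-8, 400, L, U);  % restart 30, tol 1e-8, 400 outer iterations
```

If your finite element matrix is symmetric positive definite, pcg with an ichol preconditioner is usually the better fit.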

While backslash takes advantage of the sparsity of A, the factorization it uses produces fill-in, so the factors can require far more memory than the occupied elements of A itself. A few things you can try:
If you're dividing sparse matrices with only a few diagonals, you can try to solve the system with forward/backward substitution (see the sketch after this list).
Try breaking the problem up into smaller subproblems.
Run whos to see what variables are occupying your memory before you start the matrix division; can any of these be cleared beforehand?
Not applicable to your problem as you've stated it here, but if your system is overdetermined (A has more rows than columns), then the pseudo-inverse form (A.'*A)\(A.'*b) produces a result using the smaller column dimension.
As for adding more memory: the 32-bit version of MATLAB can address at most 2^32 bytes (4 GB), so increasing the physical RAM on your computer won't help unless you're using the 64-bit version.
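A minimal sketch of the forward-substitution item, assuming L is a lower-triangular sparse matrix (in practice MATLAB's backslash detects triangularity and does exactly this for you):

```matlab
% Forward substitution for L*x = b with L lower triangular: solve row by row.
% No factorization step means no fill-in and no extra memory beyond x.
n = length(b);
x = zeros(n, 1);
for i = 1:n
    x(i) = (b(i) - L(i, 1:i-1) * x(1:i-1)) / L(i, i);
end
```

An upper-triangular system is the mirror image: loop from n down to 1 (backward substitution).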

MATLAB's \ usually tries several methods to solve a problem. First, if it sees that the structure of your matrix is symmetric, it tries a Cholesky factorization. If after several steps it cannot find a suitable answer, the current version of MATLAB uses UMFPACK from the SuiteSparse package.
UMFPACK is a specific LU implementation, and it is known for its speed and good use of memory in practice. It also tries to reduce fill-in and keep the matrix as sparse as possible. That is why MATLAB uses this code.
(I am working on UMFPACK for my PhD under the supervision of Dr. Tim Davis, its creator.)
Therefore, using another LU factorization won't help; it is an LU factorization already.
One of the easiest ways to test your problem is to run it on another machine with more memory and see if it works there.
I suspect MATLAB does some garbage collection and wastes some memory, so using UMFPACK directly might help. You can either call it from C/C++ or use its MATLAB interface; take a look at the SuiteSparse package.
Based on the structure of your matrix, I think MATLAB tries to use Cholesky; I don't know what MATLAB's strategy is if Cholesky fails for lack of memory. Take into account that Cholesky is easier to manage in terms of memory.
There are other packages that might help you as well. CSparse is a lightweight package that might help, and SuperLU is another famous package worth searching for.
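If you want to see which of these paths backslash actually takes on your matrix, MATLAB's sparse-solver monitor will print its decisions (a quick diagnostic, assuming A and b are your system):

```matlab
% Turn on verbose sparse-solver diagnostics: backslash then reports whether
% it chose a banded solver, Cholesky (CHOLMOD), or UMFPACK LU for this A.
spparms('spumoni', 2);
x = A \ b;
spparms('spumoni', 0);   % restore the quiet default
```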

Related

MATLAB Out of memory. Type HELP MEMORY for your options

I use the pcacov command in MATLAB with a 20000x20000 matrix as input, but MATLAB can't handle it memory-wise and shows the error:
pcacov: Out of memory. Type HELP MEMORY for your options.
How can I solve this problem in code or through MATLAB settings, without adding or changing any hardware memory or changing the PC?
I do not think that you can fix this problem without adding more memory. A 20000x20000 double matrix alone takes 20000*20000*8 bytes = 3.2 GB, and the decomposition inside pcacov needs several working copies of that size. You can consider algorithms that use partitioned matrices, or you may try tall arrays in MATLAB, but these techniques make the work more complex.
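If only the leading components are needed, one workaround (a sketch, not a drop-in pcacov replacement) is to ask eigs for the top k eigenpairs of the covariance matrix instead of the full decomposition:

```matlab
% C is the 20000x20000 covariance matrix (~3.2 GB in double precision, which
% must still fit in RAM). eigs computes only k eigenpairs, avoiding the full
% spectral decomposition that pcacov performs.
k = 10;                                % number of principal components to keep
[coeff, D] = eigs(C, k, 'lm');         % 'lm' = largest magnitude eigenpairs
latent = diag(D);                      % variance along each component
explained = 100 * latent / trace(C);   % percent of total variance explained
```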

LAPACK simple vs expert driver speed comparison

I want to use LAPACK to solve problems of the type Ax=b: least squares, Cholesky decomposition, SVD decomposition, etc. The manual says two types of drivers exist, simple and expert, where the expert driver gives more output information but at the cost of more workspace.
I want to know the speed difference between the two drivers.
Is it that both are essentially the same, except for the time spent copying/saving data to pointers in expert mode, which is not that significant?
It depends on the driver. For the linear solvers ?GESV and ?GESVX, the difference is that a condition number estimate is also returned and, more importantly, the solution is fed to ?GERFS for iterative refinement to reduce the error.
Often a relatively(!) considerable slowdown is expected from the expert routines. You can test it yourself by using the same input. For the GESV/GESVX comparison we had a significant slowdown, which is now fixed in SciPy 1.0: the solution refinement is skipped while the condition number reporting is kept.
See https://github.com/scipy/scipy/issues/7847 for more information.
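If you want a rough feel for the overhead without writing LAPACK calls directly, there is a loose MATLAB analogy (only an analogy: linsolve does not run ?GERFS refinement, but requesting its second output adds a reciprocal-condition estimate on top of the plain solve):

```matlab
% Plain solve vs. solve plus condition estimate, timed on the same input.
A = randn(2000); b = randn(2000, 1);
tic; x1 = linsolve(A, b);       t_plain = toc;
tic; [x2, rc] = linsolve(A, b); t_cond  = toc;  % rc = 1/cond estimate for square A
fprintf('plain: %.3f s   with rcond: %.3f s\n', t_plain, t_cond);
```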

Solving Ax=b where A is too big to be stored in a single array

Problem: A is square, full rank, sparse and banded. It has far too many elements to be stored as a single matrix in MATLAB (at least ~4.6*10^18 and ideally ~10^40, both of which exceed the maximum array size. EDIT: A is stored as sparse, and the problem is not limited memory but the limited number of elements). Therefore I have to store it as a collection of smaller arrays (rows/diagonals/columns/blocks).
Looking for: a way to solve Ax=b, with A given as a collection of smaller arrays. Ideally in Matlab but not a must.
Alternatively, if not in Matlab: maybe there's a program that can store and solve such a big A?
Found so far: methods for when A is tri/pentadiagonal, but my A has N diagonals. I also found something about partitioning A into blocks, but couldn't find a way to then solve a linear system with these blocks.
p.s. The system is 64-bit.
Thanks everyone!
Not using Matlab would allow you to store larger arrays. ROOT is an open source framework developed at CERN that has C++ and Python interfaces and a variety of solvers. It is also capable of handling huge datasets and has a variety of visualization and analysis tools as well.
If you are interested in writing C or Fortran, BLAS (Basic Linear Algebra Subprograms) and CBLAS would be good options. There are many open source and proprietary implementations of BLAS available for most Linux/UNIX distributions, and there are plenty of examples online showing how to use the BLAS subroutines from C and Fortran code.
If you have access to MATLAB's Parallel Computing Toolbox together with MATLAB Distributed Computing Server, you may be able to store A as a distributed array, in other words a single array whose elements are distributed across the memories of multiple machines in a cluster. You can call MATLAB's backslash command directly on a distributed array, and MATLAB handles the parallelization for you.
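A minimal sketch of that workflow (the cluster profile name 'myCluster' is a placeholder for your own; the aggregate memory of the workers must be large enough to hold A):

```matlab
% Build A directly as a distributed sparse array so no single machine ever
% holds the whole matrix, then let backslash solve it across the workers.
parpool('myCluster');              % start workers on the cluster
n = 1e6;
A = distributed.speye(n);          % placeholder: assemble your banded A here
b = distributed.ones(n, 1);
x = A \ b;                         % mldivide runs on the distributed array
xlocal = gather(x);                % collect the solution on the client
```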
I wanted to put this as a comment, but I think it is better to state it as an answer.
You have a serious problem. It is not only a problem of indexing, it is also a problem of memory: 4.6x10^18 is huge. That is 4.6 exa-elements. If you store them as single-precision reals, you need 4 x 4.6 = 18.4 exabytes of memory. A computer with such a huge memory does not yet exist, to my knowledge. You would need to gather the storage (hard disk, not RAM) of a significant proportion of all the computers in the world to hold such a matrix. Think about it. Going to 10^40 elements is completely impractical for the time being. Even your 64-bit computer can barely address 4.6x10^18 elements: a 64-bit address (or integer) can directly index 2^64 elements, which is roughly 1.8x10^19. So you have to think twice.
Going back to the problem itself, there are chances that you can turn your matrix into an implicit operator. By implicit operator, I mean, you do not need to store it, because it has a pattern that you know how to reproduce, or you can apply it to a vector without actually forming the matrix. If you have the matrix in hand, you are very likely in this situation, considering what I said above.
If that is the case, to solve your problem you simply need to use an iterative solver and provide a black box that performs your matrix multiplication. Going in any other direction is likely a waste of your time.
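A minimal sketch of the black-box idea, using a diagonally dominant tridiagonal stencil as a stand-in for your banded operator (replace afun with whatever rule generates your diagonals; pcg requires a symmetric positive definite operator, otherwise use gmres or bicgstab):

```matlab
% Matrix-free conjugate gradients: the solver only ever calls afun(x) = A*x,
% so the matrix (diagonal 4, off-diagonals -1 here) is never stored at all.
n = 1e6;
afun = @(x) [4*x(1) - x(2);
             -x(1:end-2) + 4*x(2:end-1) - x(3:end);
             -x(end-1) + 4*x(end)];
b = ones(n, 1);
[x, flag, relres, iter] = pcg(afun, b, 1e-8, 500);
```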

Linear Programming Solver for MATLAB, similar to cplexlp or linprog

I'm using MATLAB 2010b 64-bit and its CPLEX integration to solve an engineering problem. However, because of a memory leak in CPLEX, memory usage exceeds acceptable limits (100+ GB including virtual memory), so I am not able to solve my problem. You can see a similar post here.
Then I tried MATLAB's linprog from the Optimization Toolbox, but the result was disappointing: the running time for a small problem instance increased from 80 CPU sec to 2600 CPU sec.
Now I need an LP solver integration for MATLAB that is similar to CPLEX or linprog. By "similar" I mean the way it accepts data input in the form (f, A, b, Aeq, beq, etc.).
I must be able to use it in loops. Do you have any suggestions?
I would be very surprised if there was a memory leak in CPLEX. If you have a large problem, then the memory will grow with any sensible solver. Is there perhaps a memory leak in the interface to CPLEX? How big is your problem? Are you running multi-threaded? Each thread takes a copy of the problem and hence will eat a lot more memory.
You should not be surprised to find that other solvers take a lot longer than CPLEX to solve your problem. Certainly the free solvers will be very much slower than CPLEX on any large problem.
After some attempts to fix the MATLAB/CPLEX API's memory usage problem (memory leak), and after referring to some studies, I decided to switch to the Gurobi solver. For pure LP problems it seems to be slightly slower than CPLEX, but this may be due to the way I use Gurobi; someone else may find Gurobi faster. I have suggested this in my previous posts under different questions. Here is an academic study: [Analysis of commercial and free and open source solvers for linear optimization problems][1]
[1]: http://www.statistik.tuwien.ac.at/forschung/CS/CS-2012-1complete.pdf
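For reference, Gurobi's MATLAB interface takes the problem as a struct rather than the linprog argument list, but the mapping is direct (a sketch, assuming Gurobi and its MATLAB API are installed; f, A, b are your linprog-style inputs):

```matlab
% Solve min f'*x subject to A*x <= b, x >= 0 with Gurobi's MATLAB API.
model.obj   = f;                   % linprog's f
model.A     = sparse(A);           % Gurobi requires a sparse constraint matrix
model.rhs   = b;                   % linprog's b
model.sense = '<';                 % every row is a <= constraint
model.lb    = zeros(numel(f), 1);  % x >= 0
params.outputflag = 0;             % suppress the solver log (handy in loops)
result = gurobi(model, params);
x = result.x;                      % optimal point, like linprog's first output
```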

How do I obtain the eigenvalues of a huge matrix (size 2x10^5 by 2x10^5)?

I have a matrix of size 200000 x 200000 and I need to find its eigenvalues. I was using MATLAB until now, but the size of the matrix is unmanageable for MATLAB, so I shifted to Perl, and now even Perl is unable to handle this huge matrix; it says out of memory. I would like to know if I can find the eigenvalues of this matrix using some other programming language that can handle such huge data. The elements are mostly nonzero, so there is no option of going for a sparse matrix. Please help me solve this.
I think you may still have luck with MATLAB. Take a look at their distributed computing toolbox; you'd need some kind of parallel environment, such as a computing cluster.
If you don't have a computational cluster, you might look into distributed eigenvalue/vector calculation methods that could be employed on Amazon EC2 or similar.
There is also a discussion of parallel eigenvalue calculation methods here, which may direct you to better libraries and programming approaches than Perl.
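If you only need a few extreme eigenvalues rather than all 200000 of them, a matrix-free approach is another option (a sketch; applyA is a hypothetical routine you would write, e.g. streaming blocks of the matrix from disk and returning A*x):

```matlab
% eigs only needs matrix-vector products, so the 200000x200000 dense matrix
% (~320 GB in double precision) never has to sit in RAM all at once.
n = 200000;
k = 6;                                    % how many eigenvalues to compute
opts.issym = true;                        % set true only if A is symmetric
lambda = eigs(@applyA, n, k, 'lm', opts); % 'lm' = largest magnitude
```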