Maximum dimension of sparse matrix - matlab

While implementing a third-order Markov chain, I got an error suggesting that there is a maximum dimension for sparse matrices.
E.g. asking for the following sparse matrix
T = spalloc(1e12,1e12,1e5);
I get the error
Error using sparse
Requested 1000000000000x1000000000000 (7450.6GB) array exceeds
maximum array size preference. Creation of arrays greater than
this limit may take a long time and cause MATLAB to become
unresponsive. See array size limit or preference panel for more
information.
As I understand sparse matrices, this one should take up ~3*8*1e5 = 2.4 MB of memory (2 doubles for the coordinates and a double for the value of each non-zero entry), which is significantly less than the 7450.6GB claimed in the error.
Is my understanding of sparse matrices wrong, or is something else going wrong? And probably most importantly, is there any way to circumvent this problem?
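For what it's worth, the 7450.6GB figure is consistent with how MATLAB stores sparse matrices: in compressed sparse column form, which keeps one 8-byte column pointer per column (plus one) regardless of how many non-zeros there are. The following is only a small sketch of that effect, not a fix; the variable names are made up for illustration and the sizes are kept small so it runs anywhere.

```matlab
% Two empty sparse matrices with the same number of elements.
tall = sparse(1e6, 10);    % many rows, few columns
wide = sparse(10, 1e6);    % few rows, many columns

infoTall = whos('tall');
infoWide = whos('wide');
fprintf('tall: %d bytes, wide: %d bytes\n', infoTall.bytes, infoWide.bytes);

% Column-pointer cost for 1e12 columns, in GB (2^30 bytes),
% which is the number quoted in the error message.
pointerGB = 8 * 1e12 / 2^30;
fprintf('column pointers for 1e12 columns: %.1f GB\n', pointerGB);
```

So the triplet estimate of 3*8*nnz bytes misses a term proportional to the number of columns. If the problem allows it, arranging the data so that the huge dimension is the row count rather than the column count may avoid the error, but whether that fits a third-order Markov chain depends on how the states are indexed.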

Related

Preallocate space for array with known maximum size

Is there any way to allot a fixed chunk of memory for a growing/decreasing array in a loop? I know the range of sizes it can have, i.e. the min and max dimensions.
I could just allot a matrix of max size as follows,
A = zeros(max,max);
but here's the problem with that approach. I have matrix multiplication and inverse operations inside the loop. On top of that, I am using slicing operation (complete row/column selections)
A(:,i) = data(i).x;
B = A\P;
C = A*W;
Allocating the maximum size does not go well with these operations (size mismatch error).
So, I am trying to allocate a memory chunk corresponding to a max dimension, but want to utilize only a part of it.
I know this can be achieved using loops for matrix operations, instead of vector operations in Matlab but that would be inefficient. (in fact I am not sure how the inverse operation would be implemented in a loop).
Any help is appreciated.
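One way this could be approached, as a rough sketch: allocate the full-size matrix once and index only the active block inside the loop. Everything here (maxN, P, the data written into A) is hypothetical and only stands in for the real data. Note that A(1:k,1:k) creates a temporary copy of the block, so this avoids regrowing A but not all copying.

```matlab
maxN = 5;                  % assumed known upper bound on the size
A = zeros(maxN);           % allocated once, outside the loop
P = (1:maxN)';             % hypothetical right-hand side
x = cell(maxN, 1);         % solutions differ in length, so use a cell array

for k = 1:maxN
    A(1:k, k) = 1;         % hypothetical data for column k
    Ak = A(1:k, 1:k);      % active block only (temporary copy)
    x{k} = Ak \ P(1:k);    % solve with matching sizes
end
```

The size mismatch goes away because both operands of the backslash are cut down to the same active size k.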

Memory efficiency of max function in matlab

I have a very large matrix which is close to maxing out available memory, and my script fails due to insufficient memory to execute it. At some point, I have to compute the maximum value of said matrix, D. Is there a difference, memory wise, between max(D(:)) and max(max(D))?
Yes, there is.
max(D(:))
reshapes the matrix (no copy of the data is made) and computes the maximum value of the resulting vector.
max(max(D))
computes a maximum projection of the matrix, yielding a row vector, and then computes the maximum value of that vector.
Thus, the second option needs intermediate memory that the first one doesn’t.
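A small illustration of the two forms, using a made-up 4x4 matrix in place of the large D:

```matlab
D = magic(4);              % small stand-in for the large matrix

m1 = max(D(:));            % maximum over the matrix viewed as one column
rowOfMaxima = max(D);      % intermediate 1x4 row vector of column maxima
m2 = max(rowOfMaxima);     % same as max(max(D))
```

Both give the same value; the intermediate row vector has only one element per column, so for a 2-D matrix the extra memory is small compared with D itself.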

how to generate a large square binary matrix in MATLAB

I need to generate a large square binary sparse matrix in MATLAB (about 100k x 100k), but I get the "out of memory" error.
Can anybody help?
A full 100,000 x 100,000 matrix contains 10,000,000,000 doubles. At 8 bytes each, that's 80,000,000,000 bytes, i.e. about 74.5058 GB.
I seriously doubt you have 80 GB of RAM (let alone allocated only to MATLAB), so presumably you'll have to find another way to process your data in chunks.
EDIT Apologies, I only just noticed the sparse bit.
If you try to initialise your sparse matrix as sparse(zeros(100000,100000)), this will fail for the above reason (i.e. you're asking Octave / MATLAB to first store a 75 GB matrix of zeros, and only then convert it to a sparse matrix).
Instead, you should initialise your 100,000x100,000 sparse matrix like so:
s = sparse(100000,100000);
and then proceed to fill in its contents.
Assuming the number of nonzero elements in your sparse matrix is low enough that they can be handled easily with your system's memory, and that you have a way of filling in the necessary values you have in mind without allocating a big bad matrix first, then this should work fine.
Have a look at the sparse function for other ways of initialising a sparse matrix from data.
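As a sketch of one of those other ways, the matrix can be built in a single call from index vectors, so no full matrix ever exists. The pattern below (one entry per row, shifted cyclically) is purely hypothetical; substitute your own row and column indices.

```matlab
n = 100000;
rows = (1:n)';                       % hypothetical row indices
cols = [2:n, 1]';                    % hypothetical column indices
S = sparse(rows, cols, true, n, n);  % logical (binary) sparse matrix
```

Building from index vectors in one go is generally preferable to filling an empty sparse matrix element by element in a loop, since each insertion may force the internal storage to be rearranged.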
Try increasing the size of the swap file of your system.

Speed up gf(eye(x)) a.k.a. Speed up Galois field creation for sparse matrices

From the documentation (communications toolbox)
x_gf = gf(x,m) creates a Galois field array from the matrix x. The Galois field has 2^m elements, where m is an integer between 1 and 16.
Fine. The effort for big matrices grows with the number of elements of x. No surprise, as every element must be "touched" at some point.
Unfortunately, this means that the cost of gf(eye(n)) grows quadratically with n. Is there a way to profit from all the zeros in there?
PS: I need this to delete a row from a gf matrix, as the usual m(c,:) = [] way does not work, and my idea of multiplying a gf matrix with a cut identity matrix was surprisingly slow.
I don't have this toolbox, but maybe gf supports sparse-data inputs, which could drastically reduce your execution time in such a case.

Matlab: 'Block' matrix multiplication without resorting to repmat

I have a matrix A (of dimensions mxn) and vector b (of dimensions nx1).
I would like to construct a vector which is repmat((A*b),[C 1]), where C = n/m. I am using a lot of data and therefore n~100000 and C~10.
As you can see this is really block matrix multiplication without having to explicitly create the full A block matrix (dimensions nxn) as this easily exceeds available memory.
A is sparse and has already been converted using the function sparse().
Is there a better way of doing this? (Considering the speed and memory footprint trade-off, I'd rather have a smaller memory footprint.)
Usually if I was doing elementwise calculations, I would use bsxfun instead of using repmat to minimise memory footprint. As far as I know there is no equivalent bsxfun for matrix multiplication?
It looks like it is time for you to REALLY learn to use sparse matrices instead of wondering how to do this otherwise.
SPARSE block diagonal matrices do NOT take up a lot of memory if you create them correctly. I'll use my blktridiag function, which actually creates block tridiagonal matrices. Here I've used it to create random block diagonal matrices. I've set the off-diagonal elements to zero, so it really is block diagonal.
A = rand(3,3,100000);
tic,M = blktridiag(A,zeros(3,3,99999),zeros(3,3,99999));toc
Elapsed time is 0.478068 seconds.
And, while it is not small, the memory required is not much more than twice the memory required to store the diagonal elements themselves.
whos A M
  Name           Size                 Bytes  Class     Attributes
  A              3x3x100000         7200000  double
  M         300000x300000          16800008  double    sparse
Here about 17 megabytes, compared to 7 megabytes.
Note that blktridiag explicitly creates a sparse matrix directly.
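If blktridiag is not at hand, a similar sparse block diagonal matrix can be sketched with built-in functions only. This is not what blktridiag does internally; it swaps in kron with speye, and it only covers the case where every diagonal block is the same matrix. The sizes and the block A below are made up for illustration.

```matlab
m = 3;                       % hypothetical block size
C = 4;                       % hypothetical number of blocks
A = sparse(magic(m));        % hypothetical small block
M = kron(speye(C), A);       % C copies of A on the diagonal, stored sparse

x = ones(m * C, 1);
y = M * x;                   % each block multiplies its own slice of x
```

Because both inputs to kron are sparse, the result is sparse as well, so only the non-zero blocks are stored.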