Preallocate space for array with known maximum size - matlab

Is there any way to allocate a fixed chunk of memory for an array that grows/shrinks in a loop? I know the range of sizes it can take, i.e. the minimum and maximum dimensions.
I could just allocate a matrix of the maximum size as follows (using maxN as the bound, since calling the variable max would shadow the built-in function),
A = zeros(maxN,maxN);
but here's the problem with that approach. I have matrix multiplication and inverse operations inside the loop. On top of that, I use slicing operations (complete row/column selections):
A(:,i) = data(i).x;
B = A\P;
C = A*W;
Allocating the maximum size does not play well with these operations (size mismatch errors).
So, I am trying to allocate a memory chunk corresponding to a max dimension, but want to utilize only a part of it.
I know this could be achieved with explicit loops instead of vectorized matrix operations in MATLAB, but that would be inefficient (in fact, I am not sure how the inverse operation would even be implemented in a loop).
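Concretely, the kind of pattern I'm after looks something like this (a sketch only; maxN, data, P and W are placeholder names matching the snippets above):

```matlab
% Sketch: preallocate once at the assumed maximum dimension, then
% operate only on the columns filled so far. Note that A(:,1:i) does
% create a temporary copy, but the preallocated A itself is never resized.
maxN = 500;                 % hypothetical maximum dimension
A = zeros(maxN, maxN);
for i = 1:numel(data)
    A(:,i) = data(i).x;     % fill the next column in place
    Ak = A(:, 1:i);         % active leading block, so sizes match
    B = Ak \ P;             % P, W as in the question
    C = Ak * W;
end
```

The copy made by A(:,1:i) is usually still cheaper than growing A each iteration, since growing reallocates and copies the whole array every time.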
Any help is appreciated.

Related

Maximum dimension of sparse matrix

While implementing a third-order Markov chain, I got an error indicating that there is a maximum dimension for sparse matrices.
E.g. asking for the following sparse matrix
T = spalloc(1e12,1e12,1e5);
I get the error
Error using sparse
Requested 1000000000000x1000000000000 (7450.6GB) array exceeds
maximum array size preference. Creation of arrays greater than
this limit may take a long time and cause MATLAB to become
unresponsive. See array size limit or preference panel for more
information.
From how I understand sparse matrices, the matrix should take up about 3*8*1e5 = 2.4 MB of memory (two doubles for the coordinates and one double for each non-zero entry), which is significantly less than the 7450.6 GB claimed in the error.
Is my understanding of sparse matrices wrong, or is something else going on? And, most importantly, is there any way to circumvent this problem?
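For what it's worth, the 7450.6 GB figure matches the cost of MATLAB's compressed sparse column (CSC) storage, which keeps one 8-byte column pointer per column regardless of the number of non-zeros; a rough check:

```matlab
% MATLAB stores sparse matrices in CSC form: the column-pointer array
% alone has (ncols + 1) entries of 8 bytes each, independent of nnz.
ncols = 1e12;
colPtrBytes = 8 * (ncols + 1);   % about 8e12 bytes
colPtrGB = colPtrBytes / 2^30    % about 7450.6 (binary) GB
```

So the per-column overhead, not the non-zero entries, is what dominates here.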

Memory efficiency of max function in MATLAB

I have a very large matrix which is close to maxing out the available memory, and my script fails due to insufficient memory. At some point, I have to compute the maximum value of said matrix, D. Is there a difference, memory-wise, between max(D(:)) and max(max(D))?
Yes, there is.
max(D(:))
reshapes the matrix (no copy of the data is made) and computes the maximum value of the resulting vector.
max(max(D))
computes a maximum projection of the matrix, yielding a row vector, and then computes the maximum value of that vector.
Thus, the second option needs intermediate memory that the first one doesn’t.
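A quick way to see the difference (with a hypothetical small matrix standing in for the large one):

```matlab
% max(D(:)) reshapes in place (no data copy) and scans once;
% max(max(D)) first materializes a 1-by-n row vector of column maxima.
D = rand(1000);        % stand-in for the large matrix
m1 = max(D(:));        % no intermediate array
m2 = max(max(D));      % intermediate 1x1000 row vector
% m1 and m2 are equal; only the intermediate memory differs
```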

MATLAB: Matrix multiplication with very large arrays

I need to perform a matrix multiplication with very large matrices, something like 5000x13 * 13x2000000. This leads to an error message as I don't have enough memory. I understand this.
Now, what's the best strategy to overcome this memory issue?
My suggestion is to split the array that you want to generate. Even if you could generate the array, you could not store it: a 5000-by-2,000,000 double array exceeds the default array size limit in MATLAB. This limit applies to the size of each individual array, not the total size of all MATLAB arrays, so the problem is not coming from the multiplication itself.
My suggestion is that you split your output matrix into four blocks of 5000-by-500,000 each and compute the multiplication for each block separately.
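A minimal sketch of that blocking strategy (variable names and the save step are placeholders; you would process or write out each block before moving on):

```matlab
% Multiply 5000x13 by 13x2000000 in four column blocks of 500k each,
% writing each result block to disk instead of holding the full 80 GB.
A = rand(5000, 13);
B = rand(13, 2000000);
blockCols = 500000;
for k = 1:4
    cols = (k-1)*blockCols + (1:blockCols);
    Ck = A * B(:, cols);                              % 5000 x 500000 block
    save(sprintf('block_%d.mat', k), 'Ck', '-v7.3');  % or process Ck here
end
```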
You could use tall arrays. They have some quirks regarding the order of multiplication and such, so you may want to look up the documentation. You weren't very specific about what you want exactly, so let's say you want to find the mean of your matrix multiplication. Here's how you could do it with tall arrays:
a = rand(5000,13).';
b = tall(rand(13,2000000).');
c = b * a;
d = mean(c,1);
e = gather(d).';
Note that in tall array multiplication, only one matrix is allowed to be tall and if the tall array is multiplied with another matrix, then the tall array must come first. This is why I used transposes quite liberally.

how to generate a large square binary matrix in MATLAB

I need to generate a large square binary sparse matrix in MATLAB (about 100k x 100k), but I get an "out of memory" error.
Can anybody help?
A 100,000 x 100,000 matrix contains 10,000,000,000 doubles. At 8 bytes each, that's 80,000,000,000 bytes, i.e. about 74.5 GB.
I seriously doubt you have 80 GB of RAM (let alone allocated only to MATLAB), so presumably you'll have to find another way to process your data in chunks.
EDIT Apologies, I only just noticed the sparse bit.
If you try to initialise your sparse matrix as sparse(zeros(100000,100000)), this will fail for the above reason (i.e. you're asking Octave/MATLAB to first store a 75 GB dense matrix of zeros, and only then convert it to a sparse matrix).
Instead, you should initialise your 100,000x100,000 sparse matrix like so:
s = sparse(100000,100000);
and then proceed to fill in its contents.
Assuming the number of nonzero elements in your sparse matrix is low enough that they can be handled easily with your system's memory, and that you have a way of filling in the necessary values you have in mind without allocating a big bad matrix first, then this should work fine.
Have a look at the sparse function for other ways of initialising a sparse matrix from data.
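For instance, if you already know the positions of the non-zeros, a triplet call to sparse builds the matrix directly without any dense intermediate (the random positions below are purely illustrative):

```matlab
% Build a 100k x 100k binary sparse matrix from (row, col) index lists.
n = 100000;
nnzWanted = 1e5;                     % assumed number of non-zeros
i = randi(n, nnzWanted, 1);
j = randi(n, nnzWanted, 1);
S = sparse(i, j, 1, n, n);           % duplicate positions are summed
S = S ~= 0;                          % force binary (logical) values
```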
Try increasing the size of the swap file of your system.

Matlab: 'Block' matrix multiplication without resorting to repmat

I have a matrix A (of dimensions mxn) and vector b (of dimensions nx1).
I would like to construct a vector which is repmat((A*b),[C 1]), where C = n/m. I am using a lot of data, and therefore n ~ 100000 and C ~ 10.
As you can see this is really block matrix multiplication without having to explicitly create the full A block matrix (dimensions nxn) as this easily exceeds available memory.
A is sparse and has already been converted using the function sparse().
Is there a better way of doing this? (Considering the speed/memory footprint trade-off, I'd rather have a smaller memory footprint.)
Usually, if I were doing elementwise calculations, I would use bsxfun instead of repmat to minimise the memory footprint. As far as I know, there is no bsxfun equivalent for matrix multiplication?
It looks like it is time for you to REALLY learn to use sparse matrices instead of wondering how to do this otherwise.
SPARSE block diagonal matrices do NOT take up a lot of memory if you create them correctly. I'll use my blktridiag function, which actually creates block tridiagonal matrices. Here I've used it to create random block diagonal matrices. I've set the off-diagonal elements to zero, so it really is block diagonal.
A = rand(3,3,100000);
tic,M = blktridiag(A,zeros(3,3,99999),zeros(3,3,99999));toc
Elapsed time is 0.478068 seconds.
And, while it is not small, the memory required is not much more than twice the memory required to store the diagonal elements themselves.
whos A M
  Name      Size                 Bytes  Class    Attributes

  A         3x3x100000         7200000  double
  M         300000x300000     16800008  double   sparse
That is about 17 megabytes here, compared to 7 megabytes.
Note that blktridiag explicitly creates a sparse matrix directly.
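If you'd rather not depend on the File Exchange blktridiag function, an equivalent block diagonal matrix can be assembled directly with a triplet call to sparse (a sketch using the same 3x3x100000 stack of blocks as above):

```matlab
% Assemble a sparse block diagonal matrix from a p x q x N stack of blocks.
A = rand(3, 3, 100000);
[p, q, N] = size(A);
[I, J] = ndgrid(1:p, 1:q);                        % indices within one block
rows = repmat(I(:), N, 1) + kron(((0:N-1)*p).', ones(p*q, 1));
cols = repmat(J(:), N, 1) + kron(((0:N-1)*q).', ones(p*q, 1));
M = sparse(rows, cols, A(:), N*p, N*q);           % 300000 x 300000 sparse
```

This also avoids any dense intermediate, and the resulting matrix should have a footprint comparable to the blktridiag result.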