I need to generate a large square binary sparse matrix in MATLAB (about 100k x 100k), but I get an "out of memory" error.
Can anybody help?
A 100,000 x 100,000 matrix contains 10,000,000,000 doubles. At 8 bytes each, that's 80,000,000,000 bytes, i.e. about 74.5 GB.
I seriously doubt you have 80 GB of RAM (let alone allocated only to MATLAB), so presumably you'll have to find another way and process your data in chunks.
EDIT: Apologies, I only just noticed the sparse bit.
If you try to initialise your sparse matrix as sparse(zeros(100000,100000)), this will fail for the above reason (i.e. you're asking Octave/MATLAB to first store a 75 GB matrix of zeros, and only then convert it to a sparse matrix).
Instead, you should initialise your 100,000x100,000 sparse matrix like so:
s = sparse(100000,100000);
and then proceed to fill in its contents.
Assuming the number of nonzero elements in your sparse matrix is low enough to be handled comfortably by your system's memory, and that you have a way of filling in the values you have in mind without allocating a big, dense matrix first, this should work fine.
Have a look at the sparse function for other ways of initialising a sparse matrix from data.
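For example, if the positions of the ones are already known as index vectors, a minimal sketch (with made-up indices) builds the binary matrix in a single call:
i = [1; 5; 100000];                    % example row indices of the ones
j = [2; 7; 99999];                     % example column indices of the ones
S = sparse(i, j, 1, 100000, 100000);   % the scalar 1 is expanded to every (i,j) pair
nnz(S)                                 % only the nonzeros are actually stored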
Try increasing the size of the swap file of your system.
Related
I have a dataset in which I have a 25,000 by 25,000 matrix for each timepoint with around 60% of the cells being zeroes. I need to be able to take the eigenvalues of my matrices and reorder the columns and rows. Currently, I am running into memory errors and am considering implementing sparse matrices to save memory. Is my dataset sparse enough to do this? Would I simply use the 'sparse' and 'eigs' commands to manipulate these matrices? Thank you in advance!
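Purely as a hedged illustration of those two commands (on a much smaller stand-in matrix, and without claiming that 60% zeros will actually pay off), the workflow might look like:
n = 2500;                    % small stand-in; the real matrices are 25,000 x 25,000
M = sprand(n, n, 0.4);       % roughly 40% nonzeros, stored as sparse from the start
lambda = eigs(M, 6);         % eigs returns a few (here 6) largest-magnitude eigenvalues
perm = randperm(n);          % an example reordering
M2 = M(perm, perm);          % permute rows and columns by indexing; stays sparse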
While implementing a third-order Markov chain, I got an error indicating that there is a maximum dimension for sparse matrices.
E.g. asking for the following sparse matrix
T = spalloc(1e12,1e12,1e5);
I get the error
Error using sparse
Requested 1000000000000x1000000000000 (7450.6GB) array exceeds
maximum array size preference. Creation of arrays greater than
this limit may take a long time and cause MATLAB to become
unresponsive. See array size limit or preference panel for more
information.
From how I understand sparse matrices, the matrix should take up ~3*8*1e5 = 2.4 MB of memory (two doubles for the coordinates and one double for each non-zero entry), which is significantly less than the 7450.6GB claimed in the error.
Is my understanding of sparse matrices wrong, or what is going wrong? And, probably most importantly, is there any way to circumvent this problem?
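(For scale: MATLAB stores sparse matrices in compressed sparse column form, which keeps one 8-byte index per column on top of the nonzeros, and 8 bytes times 1e12 columns accounts for roughly the 7450.6GB quoted in the error. A hedged sketch with smaller, made-up dimensions that do fit under the limit:)
T = spalloc(1e6, 1e6, 1e5);   % 1e6 x 1e6 with room for 1e5 nonzeros
whos T                        % on the order of 10 MB, dominated by the per-column index array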
I have a 40x43 matrix and I would like to use this matrix as a building block to generate a larger matrix.
I want to generate a structure like the image attached, with the 40x43 matrix as the building block. I tried using [A zeros(20,43); zeros(20,43) A] but, as I had guessed, the horzcat didn't work. I would ideally like to use this block 1000 times to extend the structure of the matrix. Could anyone tell me an efficient way to concatenate the small matrix?
Try using kron. This performs what is known as the Kronecker product: for two matrices A and B, the result is a block matrix whose (i,j) block is A(i,j)*B.
In this case, we can replicate what you want exactly by making the first input the 1000 x 1000 identity matrix and the second input the matrix you want to replicate. However, to save computation and memory, make sure you use the sparse version of the identity matrix; this also makes the output sparse. If you replicate the block 1000 times as a full matrix, you are creating a 40000 x 43000 array that requires about 13.76 GB of memory, which you probably don't have available. Since most of the elements are zero, use the sparse version instead:
N = 1000;
B = kron(speye(N), A);
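As a quick sanity check (assuming A is the 40x43 building block from the question), the result has the expected shape and a modest footprint:
size(B)    % 40000 x 43000
nnz(B)     % at most 1000 * 40 * 43 = 1,720,000 stored entries
whos B     % a few tens of MB in sparse form, versus ~13.76 GB dense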
I am filling a sparse matrix P (230k, 290k) with values coming from a text file, which I read line by line; here is the (simplified) code:
while ...
C = textscan(text_line,'%d','delimiter',',','EmptyValue', 0);
line_number = line_number+1;
P(line_number,:)=C{1};
end
The problem I have is that, while at the beginning the
P(line_number,:)=C{1};
statement is fast, after a few thousand lines it becomes extremely slow, I guess because MATLAB needs to find new memory space to allocate every time. Is there a way to pre-allocate memory for sparse matrices? I don't think so, but maybe I am missing something. Any other advice that could speed up the operation (e.g. would having a lot of free RAM make a difference)?
There's a sixth input argument to sparse that specifies the number of nonzero elements in the matrix. That's used by MATLAB to preallocate:
S = sparse(i,j,s,m,n,nzmax) uses vectors i, j, and s to generate an
m-by-n sparse matrix such that S(i(k),j(k)) = s(k), with space
allocated for nzmax nonzeros.
So you could initialize with
P = sparse([],[],[],230e3,290e3,nzmax);
You can make a guess about the number of nonzeros (perhaps by checking the file size?) and use that as nzmax. If it turns out you need more nonzero elements in the end, MATLAB will reallocate on the fly (slowly).
By far the fastest way to generate a sparse matrix within MATLAB is to load all the values in at once, then generate the sparse matrix in one call to sparse. You have to load the data and arrange it into vectors defining the row indices, column indices and values for each filled cell. You can then call sparse using the S = sparse(i,j,s,m,n) syntax.
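A minimal sketch of that pattern, with made-up index and value vectors standing in for whatever your file parsing produces:
% Hypothetical triplets; in practice i, j and s come from parsing the file.
i = [1; 2; 230000];                  % row indices of the nonzeros
j = [10; 20; 290000];                % column indices of the nonzeros
s = [4; 7; 1];                       % the nonzero values
P = sparse(i, j, s, 230e3, 290e3);   % the whole matrix is built in one call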
I have a matrix A (of dimensions mxn) and vector b (of dimensions nx1).
I would like to construct a vector which is repmat((A*b),[C 1]), where C = n/m. I am using a lot of data and therefore n~100000 and C~10.
As you can see this is really block matrix multiplication without having to explicitly create the full A block matrix (dimensions nxn) as this easily exceeds available memory.
A is sparse and has already been converted using the function sparse().
Is there a better way of doing this? (Considering the speed vs. memory-footprint trade-off, I'd rather have a smaller memory footprint.)
Usually, if I were doing elementwise calculations, I would use bsxfun instead of repmat to minimise the memory footprint. As far as I know, though, there is no bsxfun equivalent for matrix multiplication?
It looks like it is time for you to REALLY learn to use sparse matrices instead of wondering how to do this otherwise.
SPARSE block diagonal matrices do NOT take up a lot of memory if you create them correctly. I'll use my blktridiag function, which actually creates block tridiagonal matrices; here I've used it to create a random block diagonal matrix by setting the off-diagonal blocks to zero, so it really is block diagonal.
A = rand(3,3,100000);
tic,M = blktridiag(A,zeros(3,3,99999),zeros(3,3,99999));toc
Elapsed time is 0.478068 seconds.
And, while it is not small, the memory required is not that much more than twice the memory required to store the diagonal elements themselves.
whos A M
  Name      Size                 Bytes  Class     Attributes
  A         3x3x100000         7200000  double
  M         300000x300000     16800008  double    sparse
That's about 17 megabytes here, compared to about 7 megabytes for the diagonal blocks themselves.
Note that blktridiag explicitly creates a sparse matrix directly.
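And to tie this back to the question: once M exists in sparse form, the block multiplication is just an ordinary matrix-vector product (b below is a random stand-in of the right length):
b = rand(300000, 1);   % stand-in right-hand side matching M's 300000 columns
y = M * b;             % sparse mat-vec; the dense 300000 x 300000 matrix is never formed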