Sparse Matrix Assignment becomes very slow in Matlab - matlab

I am filling a sparse matrix P (230k,290k) with values coming from a text file which I read line by line. Here is the (simplified) code:
while ...
C = textscan(text_line,'%d','delimiter',',','EmptyValue', 0);
line_number = line_number+1;
P(line_number,:)=C{1};
end
the problem I have is that while at the beginning the
P(line_number,:)=C{1};
statement is fast, after a few thousand lines it becomes extremely slow, I guess because Matlab needs to find memory to allocate every time. Is there a way to pre-allocate memory for sparse matrices? I don't think so, but maybe I am missing something. Any other advice that could speed up the operation (e.g., would having a lot of free RAM make a difference)?

There's a sixth input argument to sparse that tells the number of nonzero elements in the matrix. That's used by Matlab to preallocate:
S = sparse(i,j,s,m,n,nzmax) uses vectors i, j, and s to generate an
m-by-n sparse matrix such that S(i(k),j(k)) = s(k), with space
allocated for nzmax nonzeros.
So you could initialize with
P = sparse([],[],[],230e3,290e3,nzmax);
You can make a guess about the number of nonzeros (perhaps by checking the file size?) and use that as nzmax. If it turns out you need more nonzero elements in the end, Matlab will reallocate on the fly (slowly).
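As a rough sketch of the preallocation above: the value of nzGuess below is a made-up estimate, not something taken from the question, and spalloc is just the shorthand for the same sparse call.

```matlab
m = 230e3;                 % number of rows (from the question)
n = 290e3;                 % number of columns (from the question)
nzGuess = 1e6;             % hypothetical estimate of the nonzero count

% Empty sparse matrix with room reserved for nzGuess nonzeros.
P = sparse([], [], [], m, n, nzGuess);

% spalloc(m, n, nzGuess) requests the same preallocation.
Q = spalloc(m, n, nzGuess);

% nnz counts the stored nonzeros, nzmax reports the reserved space.
fprintf('nnz = %d, nzmax = %d\n', nnz(P), nzmax(P));
```

Comparing nnz(P) against nzmax(P) while filling is a cheap way to see whether the guess was large enough.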

By far the fastest way to generate a sparse matrix within Matlab is to load all the values in at once, then generate the sparse matrix in one call to sparse. You have to load the data and arrange it into vectors defining the row and column indices and values for each filled cell. You can then call sparse using the S = sparse(i,j,s,m,n) syntax.
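A minimal sketch of that approach, assuming a comma-separated file with one matrix row per line. The temporary file and its contents are invented here only so the example runs on its own. Note that it reads with '%f' rather than '%d', because sparse only accepts double (or logical) values.

```matlab
% Write a tiny hypothetical input file so the sketch is self-contained.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '0,3,0,0\n1,0,0,2\n0,0,0,0\n');
fclose(fid);

nCols = 4;                          % known number of columns
rows = {}; cols = {}; vals = {};    % per-line triplets, concatenated at the end

fid = fopen(fname, 'r');
line_number = 0;
text_line = fgetl(fid);
while ischar(text_line)
    line_number = line_number + 1;
    C = textscan(text_line, '%f', 'Delimiter', ',', 'EmptyValue', 0);
    v = C{1};
    k = find(v);                    % column positions of the nonzeros
    rows{end+1} = line_number * ones(numel(k), 1);
    cols{end+1} = k;
    vals{end+1} = v(k);
    text_line = fgetl(fid);
end
fclose(fid);
delete(fname);

% One single call to sparse builds the whole matrix.
P = sparse(vertcat(rows{:}), vertcat(cols{:}), vertcat(vals{:}), ...
           line_number, nCols);
```

Only the nonzero entries are ever kept in memory, and the sparse matrix is never modified after it is built.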

Related

Fast way to set many values of sparse matrix

I have a sparse 5018x5018 matrix in MATLAB, which has about 100k values set to 1 (i.e., about 99.6% empty).
I'm trying to flip roughly 5% of those zeros to ones (i.e., about 1.25m entries). I have the x and y indices in the matrix I want to flip.
Here is what I have done:
sizeMat=size(network);
idxToReplace=sub2ind(sizeMat,x_idx, y_idx);
network(idxToReplace) = 1;
This is incredibly slow, in particular the last line. Is there any way to make this operation run noticeably faster, preferably without using mex files?
This should be faster:
idxToReplace=sparse(x_idx,y_idx,ones(size(x_idx)),size(network,1),size(network,2)); % Create a sparse matrix with ones at the target locations
network=network+idxToReplace; % Add the two matrices
I think your solution is very slow because you create a 1.26e6 logical array with your points and then store them in the sparse matrix. In my solution, you only create a sparse matrix and just sum the two.
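One caveat with the addition trick: sparse sums the values of repeated (row, column) pairs, and adding onto an entry that is already 1 gives 2. A small sketch with made-up indices, using spones to reset every nonzero back to 1:

```matlab
% Small stand-in for the 5018x5018 matrix in the question.
network = sparse([1 2 3], [1 2 3], 1, 5, 5);

% Hypothetical indices to set; note that (4,4) appears twice.
x_idx = [1 4 5 4];
y_idx = [2 4 1 4];

% Duplicate (4,4) entries are summed, so that cell holds 2 here.
toAdd = sparse(x_idx, y_idx, 1, size(network,1), size(network,2));

% spones replaces every nonzero with 1, keeping the sparsity pattern.
network = spones(network + toAdd);
```

If the indices are known to be unique and to point only at zeros, the spones call is unnecessary.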

Storing a sparse matrix in blocks in Matlab?

I have to perform this operation:
N = A'*P*A
The structure of the P matrix is block diagonal while the A matrix is largely sparse (also in a banded structure). The multiplication is performed in blocks. But the problem is storage.
The N matrix is too huge to store in full (out of memory when trying to allocate). So, I want to store in a sparse fashion. While the sparse command generates only the values in row,column format, can it be applied to store banded matrices with the row column as the index of the block?
I have tried spalloc, as suggested in this question, but it hasn't helped with storing the row and column index of the block.
Thank you.
Image for A P A' formation
The problem lies in the blocks. The blocks are themselves sparse. So is it possible to make blocks as sparse matrices themselves while saving.
So, if a block has a row = 1 and col = 1, then can this be done?
N(row,col) = sparse(A'*P*A)
There may be some additional tricks to play, but the first thing to try is to make sure the full matrix N is never created in memory. The immediate problem is that if you call sparse(A'*P*A), then you multiply A'*P, then (A'*P)*A, and only then do you make the result sparse and take out the zeros. Right before making it sparse, the entire non-sparse representation of N is in memory. To force MATLAB to be smarter, do the following:
SA = sparse(A);
N = SA'*sparse(P)*SA;
whos N
You should see that N is sparse but, more importantly, each multiplication result is sparse as well because you are multiplying a sparse matrix times a sparse matrix.
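Here is a small self-contained sketch of the same idea. The banded A and block diagonal P below are invented stand-ins, since the real matrices are not shown in the question.

```matlab
n = 2000;
e = ones(n, 1);

% Hypothetical banded matrix A (tridiagonal), created directly as sparse.
A = spdiags([e 2*e e], -1:1, n, n);

% Hypothetical block diagonal P made of 2x2 blocks, also sparse.
P = kron(speye(n/2), sparse([2 1; 1 2]));

% Every operand is sparse, so each intermediate product stays sparse.
N = A' * P * A;

fprintf('issparse(N) = %d, nnz(N) = %d of %d entries\n', ...
        issparse(N), nnz(N), numel(N));
```

The point is that A and P are never stored as full matrices, so no full n-by-n intermediate is ever formed.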

Matlab: 'Block' matrix multiplication without resorting to repmat

I have a matrix A (of dimensions mxn) and vector b (of dimensions nx1).
I would like to construct a vector which is repmat((A*b),[C 1]), where C = n/m. I am using a lot of data and therefore n~100000 and C~10.
As you can see this is really block matrix multiplication without having to explicitly create the full A block matrix (dimensions nxn) as this easily exceeds available memory.
A is sparse and has already been converted using the function sparse().
Is there a better way of doing this? (Considering speed and memory
footprint trade-off, I'd rather have a smaller memory footprint)
Usually if I was doing elementwise calculations, I would use bsxfun instead of using repmat to minimise memory footprint. As far as I know there is no equivalent bsxfun for matrix multiplication?
It looks like it is time for you to REALLY learn to use sparse matrices instead of wondering how to do this otherwise.
SPARSE block diagonal matrices do NOT take up a lot of memory if you create them correctly. I'll use my blktridiag function, which actually creates block tridiagonal matrices. Here I've used it to create random block diagonal matrices. I've set the off-diagonal elements to zero, so it really is block diagonal.
A = rand(3,3,100000);
tic,M = blktridiag(A,zeros(3,3,99999),zeros(3,3,99999));toc
Elapsed time is 0.478068 seconds.
And, while it is not small, the memory required is not that much more than twice the memory required to store the diagonal elements themselves.
whos A M
  Name           Size               Bytes  Class     Attributes
  A              3x3x100000       7200000  double
  M         300000x300000        16800008  double    sparse
Here about 17 megabytes, compared to 7 megabytes.
Note that blktridiag explicitly creates a sparse matrix directly.
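If blktridiag is not at hand, a block diagonal sparse matrix can also be assembled with the built-in sparse function. This is only a sketch of one way to do it, not how blktridiag works internally. The block count K is reduced here to keep the example quick.

```matlab
K = 1000;                    % number of 3x3 blocks (smaller than above)
A = rand(3, 3, K);           % block k is A(:,:,k)

% Local row/column index and block number for every element of A.
[r, c, k] = ndgrid(1:3, 1:3, 1:K);

% Shift each block's local indices onto the global diagonal.
rows = r + 3*(k - 1);
cols = c + 3*(k - 1);

% Assemble all blocks in a single call; no full matrix is ever created.
M = sparse(rows(:), cols(:), A(:), 3*K, 3*K);
```

Block k ends up in rows and columns 3*(k-1)+1 through 3*k of M.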

out of memory error when using diag function in matlab

I have an array M of doubles with 15000 elements.
I need to convert this array to a diagonal matrix with the command diag(M),
but I get the well-known out of memory error.
I run Matlab with the -nojvm option to free up memory
and with the 3GB switch option on Windows.
I also tried converting my array to double precision,
but the problem persists.
Any other ideas?
There are much better ways to do whatever you're probably trying to do than generating the full diagonal matrix (which will be extremely sparse).
Multiplying that matrix, which has 225 million elements, by other matrices will also take a very long time.
I suggest you restructure your algorithm to take advantage of the fact that:
diag(M)(a, b) =
M(a) | a == b
0 | a != b
You'll save a huge amount of time and memory and whoever is paying you will be happier.
This is what a diagonal matrix looks like:
Every entry except those along the diagonal of the matrix (the ones where the row index equals the column index) is zero. Relating this example to your provided values, diag(M) = A and M(n) = A(n,n).
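To illustrate the restructuring suggested above: multiplying by diag(M) from the left just scales row a of the other matrix by M(a), so the diagonal matrix never needs to exist. A minimal sketch with small made-up values (the dense diag(M) is built here only to check the result):

```matlab
M = (1:5)';                     % diagonal entries as a column vector
X = magic(5);                   % any matrix with numel(M) rows

Y_ref = diag(M) * X;            % reference result, fine for tiny sizes
Y     = bsxfun(@times, M, X);   % row a of X scaled by M(a), no diag(M)

disp(isequal(Y, Y_ref));
```

For multiplication from the right, X*diag(M), the same idea scales the columns instead: bsxfun(@times, X, M').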
Use a sparse matrix:
M = spdiags( M(:), 0, numel(M), numel(M) ); % M(:) makes sure the diagonal is passed as a column vector
For more info see matlab doc on spdiags and on sparse matrices in general.
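A quick sketch to see the difference in size, using random data in place of the asker's array:

```matlab
M = rand(15000, 1);                 % stand-in for the asker's data
n = numel(M);

% Sparse diagonal matrix: only the n diagonal entries are stored.
D = spdiags(M, 0, n, n);

% sparse(1:n, 1:n, M, n, n) builds the same matrix from triplets.
D2 = sparse(1:n, 1:n, M, n, n);

whos D                              % compare with n^2*8 bytes for full diag(M)
```

The full diag(M) would need 15000^2*8 bytes, i.e. about 1.8 GB, for the same information.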
If you have an n-by-n square matrix, M, you can directly extract the diagonal elements into a row vector via
n = size(M,1); % Or length(M), but this is more general
D = M(1:n+1:end); % 1-by-n vector containing diagonal elements of M
If you have an older version of Matlab, the above may even be faster than using diag (if I recall, diag wasn't always a compiled function). Then, if you need to save memory and only need the diagonal of M and can get rid of the rest, you can do this:
M(:) = 0; % Zero out M
M(1:n+1:end) = D; % Insert diagonal elements back into M
clear D; % Clear D from memory
This should not allocate much more than about (n^2+n)*8 = n*(n+1)*8 bytes at any one time for double precision values (some will be needed for indexing operations). There are other ways to do the above that might save a bit more if you need a (full, non-sparse) n-by-n diagonal matrix, but there's no way to get around the fact that you'll need n^2*8 bytes at a minimum just to store the matrix of doubles.
However, you're still likely to run into problems. I'd investigate sparse datatypes as @user2379182 suggests. Or rework your algorithms. Or better yet, look into obtaining 64-bit Matlab and/or a 64-bit OS!

Matlab: a smart way to create a sparse matrix

I have to create a Matlab matrix that is much bigger than my physical memory, and I want to take advantage of the sparsity.
This matrix is really, really sparse [say N elements in an NxN matrix], and my RAM is enough for this. I create the matrix in this way:
A=sparse(zeros(N));
but it goes out of memory.
Do you know the right way to create this matrix?
zeros(N) is creating an NxN matrix, which is not sparse, hence you are running out of memory. Your code is equivalent to
temp = zeros(N)
A = sparse(temp)
Just do sparse(N,N).
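For example, with an N picked just for illustration, the all-zero sparse matrix is created without any N-by-N temporary:

```matlab
N = 1e6;                 % hypothetical size; zeros(N) would need about 8 TB

A = sparse(N, N);        % all-zero N-by-N sparse matrix, no full temporary

% Storage grows with the number of columns and nonzeros, not with N^2.
whos A
```

Even an empty sparse matrix keeps one column pointer per column, so its size scales with N rather than N^2.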
Creating an all zeros sparse matrix, and then modifying it is extremely inefficient in matlab.
Instead of doing something like:
A = sparse(N,N) % or even A = sparse([],[],[],N,N,N)
A(1:N,7) = 1:N
It is much more efficient to construct the matrix in triplet form. That is,
construct the column and row indices and the nonzero entries first, then
form the matrix. For example,
i = 1:N;
j = 7*ones(1,N);
x = 1:N;
A = sparse(i,j,x,N,N);
I'd actually recommend the full syntax of sparse([],[],[],N,N,N).
It's useful to preallocate if you know the maximum number of nonzero elements as otherwise you'll get reallocs when you insert new elements.