When creating a sparse matrix in Matlab it seems that you can create a sparse matrix either filled with logicals or double valued numbers.
While reading around I understood that Matlab does not support other types of sparse matrices, e.g. uint8 or other integers. In my application I know that max(values)==16, and memory is crucial, so I would like to have uint8 sparse matrices.
Is there a way of creating a uint8 sparse matrix?
If not (most likely), is there any apparent reason why Matlab has not implemented uint8 sparse matrices?
I can see how using uint8 instead of double would bring little or no improvement.
A dense matrix is a contiguous array, so no extra indexing or structuring is required: the position of each element is given by its physical location in memory.
A sparse matrix, however, must additionally store the index of each element, which for a 2D matrix would be two integers, 32 or 64 bits in size, to remember each element's row and column number. On top of that there might be some implementation-related overhead, such as a tree structure or something else, used to make sparse matrix operations efficient.
So it is not 8 bits (uint8) vs 64 bits (double), i.e. eight times less memory, but rather (8+32+32+log(n)+..) vs (64+32+32+log(n)+..) bits per entry, which I guess might end up being 10-20% savings at best?
Furthermore, memory is now typically accessed in 64-bit words, if I remember correctly, and one word holds one double or eight uint8 values packed together. This means a few extra bits would be needed per entry just to remember which of the packed uint8 values we want, plus some extra bit-masking operations.
So the guys at Mathworks probably did a similar estimate and decided to just do double sparse matrices.
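For what it's worth, the numbers can be checked directly. As far as I know, MATLAB keeps sparse matrices in compressed column form, so on a 64-bit install each nonzero costs one value plus one 8-byte row index, and each column costs one 8-byte pointer; the bookkeeping therefore differs a little from the two-index guess above. A minimal sketch (the sizes n and k are arbitrary choices of mine, and the estimate assumes 8-byte indices) comparing a sparse double with the sparse logical built from it:

```matlab
n = 1000;                      % matrix is n-by-n (arbitrary size)
k = 5000;                      % number of (row, col) pairs drawn (arbitrary)
rows = randi(n, k, 1);
cols = randi(n, k, 1);

Sd = sparse(rows, cols, ones(k, 1), n, n);   % sparse double; duplicate pairs are summed
Sl = logical(Sd);                            % sparse logical with the same nonzero pattern

infoD = whos('Sd');
infoL = whos('Sl');
fprintf('sparse double : %d bytes\n', infoD.bytes);
fprintf('sparse logical: %d bytes\n', infoL.bytes);

% Rough estimate, ASSUMING 8-byte indices (64-bit MATLAB):
% values + row indices + column pointers
estD = nnz(Sd) * (8 + 8) + 8 * (n + 1);
estL = nnz(Sl) * (1 + 8) + 8 * (n + 1);
fprintf('estimates     : %d and %d bytes\n', estD, estL);
```

If that estimate holds, a hypothetical uint8 sparse type would cost about as much as the logical one, since the index, not the value, dominates the per-entry cost.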
Related
I need to generate a large square binary sparse matrix in MATLAB (about 100k x 100k), but I get the "out of memory" error.
Can anybody help?
A 100,000 x 100,000 matrix contains 10,000,000,000 doubles. At 8 bytes each, that's 80,000,000,000 bytes, i.e. 80 GB (about 74.5058 GiB).
I seriously doubt you have 80 GB of RAM (let alone allocated only to MATLAB), so presumably you'll have to find another way to process your data in chunks.
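The arithmetic above, spelled out as a quick sanity check (the variable names are mine):

```matlab
n = 100000;
bytesPerDouble = 8;
totalBytes = n^2 * bytesPerDouble;     % 8e10 bytes for a full double matrix
totalGB  = totalBytes / 1e9;           % decimal gigabytes
totalGiB = totalBytes / 2^30;          % binary gibibytes
fprintf('%.0f GB = %.4f GiB\n', totalGB, totalGiB);   % prints "80 GB = 74.5058 GiB"
```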
EDIT Apologies, I only just noticed the sparse bit.
If you try to initialise your sparse matrix as sparse(zeros(100000,100000)), this will fail for the above reason (i.e. you're asking Octave/MATLAB to first store a 75 GiB matrix of zeros, and only then convert it to a sparse matrix).
Instead, you should initialise your 100,000x100,000 sparse matrix like so:
s = sparse(100000,100000);
and then proceed to fill in its contents.
Assuming the number of nonzero elements in your sparse matrix is low enough that they can be handled easily with your system's memory, and that you have a way of filling in the necessary values you have in mind without allocating a big bad matrix first, then this should work fine.
Have a look at the sparse function for other ways of initialising a sparse matrix from data.
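One of those other ways is to pass the row indices, column indices and values of the nonzeros in one call, which avoids both the dense intermediate and element-by-element filling. A minimal sketch, assuming you already know where the ones go; here the positions are random placeholders and k is a made-up count:

```matlab
n = 100000;                       % matrix is n-by-n
k = 500000;                       % hypothetical number of nonzero entries
rows = randi(n, k, 1);            % placeholder row indices; use your own data here
cols = randi(n, k, 1);            % placeholder column indices

% Logical values give a sparse logical (binary) matrix
B = sparse(rows, cols, true(k, 1), n, n);

fprintf('%d nonzeros, density %.2e\n', nnz(B), nnz(B) / n^2);
```

Random positions can repeat, so nnz(B) may come out slightly below k.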
Try increasing the size of the swap file of your system.
We have zeros(42,42), ones(42,42), inf(42,42), nan(42,42)...
Can we create an array without initialization and then fill it with numbers later? I know this might not be good code, since a code analyzer could not prove it safe. But when an array is large, this could save some computation.
There is no such thing as an empty array of a fixed size. If you start filling the elements of the array later, and assign a value to element [k1, k2], then you will have an array of size at least [k1, k2], all doubles (by default). The reason is that MATLAB arrays are homogeneous containers, so every element has to be a proper double (or the corresponding type of the array). Sooner or later, your array has to be allocated, with zeros in place of unassigned elements. The most efficient thing to do in case of full matrices is to preallocate, which is what zeros(k1max,k2max) does. Actually, at least in older versions of MATLAB, it is faster to pre-allocate with mymat(k1max,k2max)=0;, i.e. by assigning a single zero to the bottom-right corner of your array (this automatically pre-allocates all the other elements between that and [1,1]). Another upside of pre-allocation is that MATLAB can reserve a contiguous block of memory for the whole array at once, which is the most efficient scenario possible.
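A small sketch of the two pre-allocation styles mentioned above (the sizes are arbitrary, and I make no claim about which one is faster on a current release):

```matlab
k1max = 2000;
k2max = 3000;

A1 = zeros(k1max, k2max);      % explicit pre-allocation

clear A2                       % the trick only pre-allocates if A2 does not exist yet
A2(k1max, k2max) = 0;          % assigning the bottom-right corner allocates the whole array

% Fill afterwards; neither array has to grow inside the loop
for k = 1:k1max
    A1(k, 1) = k;
    A2(k, 1) = k;
end
```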
What you might be looking for are sparse arrays. In case of large arrays with a huge number of zero elements, it's inefficient to store all those zeros in memory, and to perform computations on them. MATLAB natively supports sparse arrays, where only the nonzero elements are stored (column by column, so there's some overhead), which leads to a huge gain in memory efficiency and performance for very sparse matrices (where the number of nonzero elements is much smaller than the total number of elements).
An important upside of sparse matrices is that all arithmetic operations and almost all matrix operations are implemented for them, or at least they are automatically cast to full matrices. This makes their use almost identical to full matrices. And in line with your question, you only store the nonzero elements. Obviously this is only efficient if the matrix is sparse enough, otherwise the overhead from the bookkeeping of elements (and not using fully vectorized matrix operations) will make their use inefficient.
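For the "allocate now, fill later" use case, something along these lines might be what you want. It is only a sketch: nzmax and the filled positions are made-up values, and spalloc merely reserves room for the nonzeros you expect.

```matlab
m = 100000;
n = 100000;
nzmax = 10;                      % assumed upper bound on the number of nonzeros
S = spalloc(m, n, nzmax);        % all-zero sparse matrix with room for nzmax nonzeros

% Fill in a few entries later (positions chosen arbitrarily)
S(5, 7)      = 1.5;
S(42, 99999) = -2;
S(100000, 1) = 3;

x = ones(n, 1);
y = S * x;                       % sparse matrix times full vector, used like a full matrix
```

Assigning entries one at a time is fine for a handful of values; for many nonzeros, building the matrix from index and value vectors in a single sparse call is usually the better route.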
As a final remark, I just want to note that you can create empty double arrays as long as one of their dimensions is zero:
>> double.empty(100,0)
ans =
Empty matrix: 100-by-0
>> double.empty(100,100)
Error using double.empty
At least one dimension must be zero.
but this rarely has a place in practical applications.
From the documentation (communications toolbox)
x_gf = gf(x,m) creates a Galois field array from the matrix x. The Galois field has 2^m elements, where m is an integer between 1 and 16.
Fine. The effort for big matrices grows with the number of elements of x. No surprise, as every element must be "touched" at some point.
Unfortunately, this means that the cost of gf(eye(n)) grows quadratically with n. Is there a way to profit from all the zeros in there?
PS: I need this to delete a row from a gf matrix, as the usual m(c,:) = [] way does not work, and my idea of multiplying the gf matrix by a cut identity matrix was surprisingly slow.
I don't have this toolbox, but maybe gf supports sparse-data inputs, which could drastically reduce your execution time in such a case.
I have a matrix A (of dimensions mxn) and vector b (of dimensions nx1).
I would like to construct a vector which is repmat((A*b),[C 1]), where C = n/m. I am using a lot of data and therefore n~100000 and C~10.
As you can see this is really block matrix multiplication without having to explicitly create the full A block matrix (dimensions nxn) as this easily exceeds available memory.
A is sparse and has already been converted using the function sparse().
Is there a better way of doing this? (Considering the speed and memory footprint trade-off, I'd rather have a smaller memory footprint.)
Usually if I was doing elementwise calculations, I would use bsxfun instead of using repmat to minimise memory footprint. As far as I know there is no equivalent bsxfun for matrix multiplication?
It looks like it is time for you to REALLY learn to use sparse matrices instead of wondering how to do this otherwise.
SPARSE block diagonal matrices do NOT take up a lot of memory if you create them correctly. I'll use my blktridiag function, which actually creates block tridiagonal matrices. Here I've used it to create random block diagonal matrices. I've set the off-diagonal elements to zero, so it really is block diagonal.
A = rand(3,3,100000);
tic,M = blktridiag(A,zeros(3,3,99999),zeros(3,3,99999));toc
Elapsed time is 0.478068 seconds.
And, while it is not small, the memory required is not that much more than twice the memory required to store the diagonal elements themselves.
whos A M
  Name           Size                Bytes  Class     Attributes

  A              3x3x100000        7200000  double
  M         300000x300000         16800008  double    sparse
Here about 17 megabytes, compared to 7 megabytes.
Note that blktridiag explicitly creates a sparse matrix directly.
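blktridiag is not part of MATLAB itself, so in case you don't have it, here is a rough sketch of the same idea using only the built-in sparse function. This is my own illustration, not how blktridiag is implemented, and I've used fewer blocks (nb is an arbitrary choice) to keep it quick:

```matlab
nb = 1000;                         % number of 3x3 blocks (arbitrary, smaller than above)
A = rand(3, 3, nb);

% r, c: position inside a block; k: zero-based block number
[r, c, k] = ndgrid(1:3, 1:3, 0:nb-1);

% Block k+1 lands at rows/columns 3*k+1 .. 3*k+3 of the big matrix
M = sparse(r(:) + 3*k(:), c(:) + 3*k(:), A(:), 3*nb, 3*nb);

whos A M
```

The point is the same as above: the block diagonal matrix is assembled directly in sparse form, and the full 3*nb-by-3*nb array is never created.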
I'm doing CDMA spreading in MATLAB. And I'm having an Out of Memory error in MATLAB despite upgrading my RAM, preallocating arrays, etc.
Is there an alternate method to kron (Kronecker tensor product) in MATLAB? Here is my code:
tempData = kron( Data, walsh);
Data is an M-by-1 matrix and walsh (the spreading code) is an 8-by-1 matrix.
My Data comprises real and imaginary parts, e.g. 0.000 + 1.000i or 1.000 + 0.000i, in double format.
This call to kron is not memory intensive in itself. I know, your problem seems so trivial. However, you don't tell us what M is. For very large values of M, you are simply trying to create an array too large to fit in memory. It is very easy to forget that your computer is not infinitely large or infinitely fast. We get spoiled when we see "giga" in front of everything.
If you absolutely must do this for that value of M, then you probably need the 64 bit version of MATLAB, AND more memory will always help once you do that.
Another option is to make Data single precision, if you can afford the loss in precision. This will at least give you an extra factor of 2. In order to provide the best help, we need to know the size of M.
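To make this concrete, here is a small sketch with made-up inputs (M, the random Data and the alternating walsh code are placeholders of mine, not your actual values). It shows how large the result is, an equivalent formulation for two column vectors using an outer product and reshape, and the single precision variant. Note that the reshape version needs the same amount of memory for the output, so it is an alternative to kron rather than a cure:

```matlab
M = 1000;                                                  % hypothetical data length
Data  = complex(randi([0 1], M, 1), randi([0 1], M, 1));   % placeholder complex symbols
walsh = [1; -1; 1; -1; 1; -1; 1; -1];                      % placeholder 8-chip code

t1 = kron(Data, walsh);                    % (8*M)-by-1 result
t2 = reshape(walsh * Data.', [], 1);       % same values; note .' (transpose without conjugation)

bytesNeeded = 16 * numel(t1);              % complex double: 16 bytes per element
fprintf('output needs about %.2f MB\n', bytesNeeded / 1e6);

t3 = kron(single(Data), single(walsh));    % single precision: half the memory, less precision
```

Plugging your real M into bytesNeeded should tell you quickly whether the result can fit in RAM at all.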