Can we create a fixed-size array without initialization - MATLAB

We have zeros(42,42), ones(42,42), inf(42,42), nan(42,42)...
Can we create an array without initialization and then fill it with numbers later? I know this may not be good code, since a code analyzer cannot prove it is safe. But when an array is large, skipping the initialization could save some computation.

There is no such thing as an empty array of a fixed size. If you start filling the elements of the array later and assign a value to element [k1, k2], then you will have an array of size at least [k1, k2], all doubles (by default). The reason is that MATLAB arrays are homogeneous containers, so every element has to be a proper double (or the corresponding type of the array). Sooner or later, your array has to be allocated, with zeros for the unassigned elements.
The most efficient thing to do for full matrices is to preallocate, which is what zeros(k1max,k2max) does. Actually, at least in older versions of MATLAB, it is faster to preallocate with mymat(k1max,k2max) = 0;, i.e. by assigning a single zero to the bottom-right corner of your array (this automatically preallocates all the other elements between that and [1,1]). Another upside of preallocation is that MATLAB can reserve a contiguous block of memory for the whole array at once, which is the most efficient scenario possible.
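For concreteness, here is a minimal sketch of the two preallocation styles mentioned above (the sizes are arbitrary, and which one is faster depends on your MATLAB version):
k1max = 5000; k2max = 5000;
tic, A = zeros(k1max, k2max); toc      % allocate and zero-fill explicitly
clear B
tic, B(k1max, k2max) = 0; toc          % grow-by-assignment: same zero-filled array
isequal(A, B)                          % both are k1max-by-k2max arrays of zeros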
What you might be looking for are sparse arrays. For large arrays with a huge number of zero elements, it is inefficient to store all those zeros in memory and to perform computations on them. MATLAB natively supports sparse arrays, where only the nonzero elements are stored (per column, so there is some overhead), which leads to huge memory savings and performance gains for very sparse matrices (where the number of nonzero elements is much smaller than the total number of elements).
An important upside of sparse matrices is that all arithmetic operations and almost all matrix operations are implemented for them, or at least they are automatically converted to full matrices where needed. This makes their use almost identical to that of full matrices. And, in line with your question, you only store the nonzero elements. Obviously this is only efficient if the matrix is sparse enough; otherwise the overhead from the bookkeeping of elements (and from not using fully vectorized matrix operations) will make their use inefficient.
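As a small, illustrative sketch of the memory argument (the size and sparsity pattern are mine, and the exact byte counts depend on platform and MATLAB version), compare a full matrix with its sparse counterpart:
n = 2000;
Afull = zeros(n);                 % full n-by-n double: n*n*8 bytes (about 32 MB here)
Afull(1:n+1:end) = 1;             % a very sparse pattern: ones on the diagonal
Asparse = sparse(Afull);          % stores only the n nonzeros (plus column bookkeeping)
Asparse2 = speye(n);              % or build it sparse from the start
whos Afull Asparse Asparse2       % compare the Bytes column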
As a final remark, I just want to note that you can create empty double arrays as long as one of their dimensions is zero:
>> double.empty(100,0)
ans =
Empty matrix: 100-by-0
>> double.empty(100,100)
Error using double.empty
At least one dimension must be zero.
but this rarely has a place in practical applications.

Related

Is using matrices with many 0's and 1's considered vectorizing?

Suppose that for some needed transformation I have an mxn matrix A that consists of a few 1's and many 0's. If I transform an nx1 vector by A, is this considered a vectorized implementation?
Since the matrix A mainly consists of 0's, does this still cause the same number of FLOPs to occur?
Would it be wiser, and better optimized, to do the needed transformation another way, one that doesn't cause needless calculations such as 0*c?
The matrix is sparse. The number of operations for most computations is lower when you store it as a sparse matrix and use the corresponding routines.
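A minimal sketch of the point (the sizes, density, and variable names are made up here): once A is stored with sparse, the multiply only does work for the nonzero entries:
m = 2000; n = 5000;
Afull = zeros(m, n);
Afull(randi(numel(Afull), 200, 1)) = 1;   % a few scattered ones, mostly zeros
Asp = sparse(Afull);                      % store only the nonzero entries
x = rand(n, 1);
tic, y1 = Afull * x; toc                  % dense multiply touches all m*n entries
tic, y2 = Asp * x; toc                    % sparse multiply touches ~nnz(Asp) entries
norm(y1 - y2)                             % same result up to round-off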

MATLAB: Matrix multiplication with very large arrays

I need to perform a matrix multiplication with very large matrices, something like 5000x13 * 13x2000000. This leads to an error message as I don't have enough memory. I understand this.
Now, what's the best strategy to overcome this memory issue?
My suggestion is to split the array that you want to generate. Even if you could generate the result, you could not store it: a 5000-by-2,000,000 double matrix (about 80 GB) is larger than the array size limit in MATLAB. This limit applies to the size of each individual array, not to the total size of all MATLAB arrays, so the problem is not the multiplication itself.
My suggestion is that you split your output matrix into four blocks, each 5000 by 500K, and compute and store the multiplication for each block separately.
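A hedged sketch of that blockwise approach (the B blocks and file names here are stand-ins; in practice each block of B would come from wherever your data actually lives):
A = rand(5000, 13);
blockSize = 500000;                        % 4 blocks of 500K columns; shrink this further
                                           % if a 5000-by-blockSize block is too large for your RAM
for k = 1:4
    Bblock = rand(13, blockSize);          % stand-in for B(:, (k-1)*blockSize + (1:blockSize))
    Cblock = A * Bblock;                   % 5000-by-blockSize piece of the result
    save(sprintf('C_block_%d.mat', k), 'Cblock', '-v7.3');   % store it before the next block
    clear Bblock Cblock
end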
You could use tall arrays. They have some quirks regarding the order of multiplication and such, so you may want to look up the documentation. You weren't very specific about what you want exactly, so let's say you want to find the mean of your matrix multiplication. Here's how you could do it with tall arrays:
a = rand(5000,13).';            % 13-by-5000 in-memory factor
b = tall(rand(13,2000000).');   % 2000000-by-13 tall array
c = b * a;                      % tall 2000000-by-5000 product (deferred)
d = mean(c,1);                  % 1-by-5000, still deferred
e = gather(d).';                % evaluate and bring back a 5000-by-1 result
Note that in tall-array multiplication only one operand is allowed to be tall, and when the tall array is multiplied with another matrix, the tall array must come first. This is why I used transposes quite liberally.

Matlab uint8 sparse

When creating a sparse matrix in MATLAB, it seems that you can create a sparse matrix filled with either logicals or double-valued numbers.
While reading around, I understood that MATLAB does not support other types of sparse matrices, e.g. uint8 or other integers. In my application I know that max(values) == 16, and memory is crucial, therefore I would like to have uint8 sparse matrices.
Is there a way of creating a uint8 sparse matrix?
If not (most likely), is there any apparent reason why MATLAB has not implemented uint8 sparse matrices?
I can see how using uint8 instead of double would give little or no improvement.
A dense matrix is a contiguous array, so no extra indexing or structure is required; the position of each element is given by its physical location in memory.
A sparse matrix, however, additionally has to store each element's index, which for a 2D matrix means two integers, 32 or 64 bits each, to remember the element's row and column. On top of that there may be some implementation-related overhead, such as a tree structure or something else used to make sparse matrix operations efficient.
So it is not 8 bits (uint8) vs 64 bits (double), i.e. eight times less memory, but rather (8+32+32+log(n)+...) vs (64+32+32+log(n)+...) bits per stored element, which I guess might end up being 10-20% savings at best.
Furthermore, each memory address stores 64 bits (if I remember correctly), which is one double or eight uint8 values packed together. This means a few extra bits are needed per entry just to remember which uint8 packed at that address we want, and some extra bit-masking operations have to be performed.
So the people at MathWorks probably made a similar estimate and decided to implement only double (and logical) sparse matrices.
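As a rough way to see the bookkeeping overhead in practice (the size and density here are arbitrary, and the exact numbers depend on platform and MATLAB version), look at the bytes per stored element of a sparse double matrix:
n = 4000;
S = sprand(n, n, 0.001);        % sparse double with roughly n*n/1000 nonzeros
info = whos('S');
info.bytes / nnz(S)             % roughly 16-20 bytes per stored element on 64-bit MATLAB:
                                % 8 bytes for the double value, 8 for the row index,
                                % plus column pointers shared across the matrix; shrinking
                                % only the 8-byte value to 1 byte (uint8) would therefore
                                % save well under half of the total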

Speed up gf(eye(x)) a.k.a. Speed up Galois field creation for sparse matrices

From the documentation (Communications Toolbox):
x_gf = gf(x,m) creates a Galois field array from the matrix x. The Galois field has 2^m elements, where m is an integer between 1 and 16.
Fine. The effort for big matrices grows with the number of elements of x. No surprise, as every element must be "touched" at some point.
Unfortunately, this means that the cost of gf(eye(n)) grows quadratically with n. Is there a way to profit from all the zeros in there?
PS: I need this to delete a row from a gf matrix, as the usual m(r,:) = [] way does not work, and my idea of multiplying the gf matrix with a cut-down identity matrix was surprisingly slow.
I don't have this toolbox, but maybe gf supports sparse-data inputs, which could drastically reduce your execution time in such a case.
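I can't verify this without the toolbox, but a quick test along these lines would tell you whether a sparse input is accepted; the row-indexing alternative at the end is likewise only an assumption that gf arrays support ordinary subscripting:
n = 1000; m = 8;
try
    G = gf(speye(n), m);        % does gf accept a sparse identity?
    disp('sparse input accepted')
catch err
    disp(err.message)           % if not, fall back to gf(eye(n), m)
end
% Possible alternative for deleting row r without any multiplication,
% assuming gf arrays support row indexing:
% M = M([1:r-1, r+1:end], :);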

Matlab: 'Block' matrix multiplication without resorting to repmat

I have a matrix A (of dimensions mxn) and vector b (of dimensions nx1).
I would like to construct a vector which is repmat((A*b),[C 1]), where C = n/m. I am using a lot of data, therefore n~100000 and C~10.
As you can see, this is really a block matrix multiplication without having to explicitly create the full block matrix of A (dimensions nxn), as this easily exceeds the available memory.
A is sparse and has already been converted using the function sparse().
Is there a better way of doing this? (Considering the speed/memory-footprint trade-off, I'd rather have a smaller memory footprint.)
Usually, if I were doing elementwise calculations, I would use bsxfun instead of repmat to minimise the memory footprint. As far as I know there is no bsxfun equivalent for matrix multiplication?
It looks like it is time for you to REALLY learn to use sparse matrices instead of wondering how to do this otherwise.
SPARSE block diagonal matrices do NOT take up a lot of memory if you create them correctly. I'll use my blktridiag function (available on the MATLAB File Exchange), which actually creates block tridiagonal matrices. Here I've used it to create a random block diagonal matrix: I've set the off-diagonal blocks to zero, so it really is block diagonal.
A = rand(3,3,100000);
tic, M = blktridiag(A, zeros(3,3,99999), zeros(3,3,99999)); toc
Elapsed time is 0.478068 seconds.
And, while it is not small, the memory required is not that much more than twice the memory required to store the diagonal elements themselves.
whos A M
  Name        Size                 Bytes  Class     Attributes
  A           3x3x100000         7200000  double
  M           300000x300000     16800008  double    sparse
Here, about 17 megabytes for the sparse matrix, compared to 7 megabytes for the diagonal blocks alone.
Note that blktridiag creates the sparse matrix directly, without ever forming a full intermediate matrix.
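For completeness, a small sketch of how the sparse block matrix would then be used in the multiplication itself (the vector x below is just a stand-in for real data):
x = rand(300000, 1);    % conformable stand-in for the long vector
y = M * x;              % sparse matrix-vector product; the full 300000x300000 matrix is never materialized as a dense array
whos x y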