Matlab: 'Block' matrix multiplication without resorting to repmat - matlab

I have a matrix A (of dimensions mxn) and vector b (of dimensions nx1).
I would like to construct a vector which is _repmat_((A*b),[C 1]), where C = n/m . I am using a lot of data and therefore n~100000 and C~10.
As you can see this is really block matrix multiplication without having to explicitly create the full A block matrix (dimensions nxn) as this easily exceeds available memory.
A is sparse and has already been converted using the function _sparse()_.
Is there a better way of doing this? (Considering speed and memory
footprint trade-off, I'd rather have a smaller memory footprint)
Usually if I was doing elementwise calculations, I would use bsxfun instead of using repmat to minimise memory footprint. As far as I know there is no equivalent bsxfun for matrix multiplication?

It looks like it is time for you to REALLY learn to use sparse matrices instead of wondering how to do this otherwise.
SPARSE block diagonal matrices do NOT take up a lot of memory if you create them correctly. I'll use my blktridiag function, which actually creates block tridiagonal matrices. Here I've used it to create random block diagonal matrices. I've set the off-diagonal elements to zero, so it really is block diagonal.
A = rand(3,3,100000);
tic,M = blktridiag(A,zeros(3,3,99999),zeros(3,3,99999));toc
Elapsed time is 0.478068 seconds.
And, while it is not small, the memory required is not that much more than twice the meemory required to store the diagonal eleemnts themselves.
whos A M
Name Size Bytes Class Attributes
A 3x3x100000 7200000 double
M 300000x300000 16800008 double sparse
Here about 17 megabytes, compared to 7 megabytes.
Note that blktridiag explicitly creates a sparse matrix directly.

Related

repmat vs simple matrix multiplication in MATLAB

Let v be a row vector of length n. The goal is to create a matrix A with m rows that are all equal to v.
MATLAB has a function for this that is called repmat. Possible code would be
A = repmat(v,[m 1])
There is an alternative equally concise way using simple matrix operations
A = ones(m,1)*v
Is any of the two methods preferable for large m and n?
Lets compare them!
When testing algorithms 2 metrics are important: time, and memory.
Lets start with time:
Clearly repmat wins!
Memory:
profile -memory on
for m=1000:1000:50000
f1=#()(repmat(v,[m 1]));
f2=#()(ones(m,1)*v);
ii=ii+1;
t1(ii)=timeit(f1);
t2(ii)=timeit(f2);
end
profreport
It seems that both take the same amount of memory. However, the profiler is known for not showing all the memory, so we can not fully trust it.
Still, it is clear that repmat is better
You should use repmat().
Matrix Multiplication is O(n ^ 3) operation which is much slower then replicating data in memory.
On top of that, the second option allocate more data in memory of the size of the output.
In the case above you create a vector which the outer multiplication is faster yet still not as memory operation.
MATLAB doesn't use the knowledge all vector elements are 1, hence you multiply each element of x by 1 m times.
Both operations will be mainly memory bounded, yet more efficient, fast and direct method would be going with repmat().
The question is, what you do afterwards?
Because you may not need repmat().

use a small matrix to generate a larger matrix by concatenating it again and again

I have a 40x43 matrix and I would like to use this matrix a building block to generate larger matrix.
I want to generate a structure like the image attached and the building block is the 40x43 matrix. I tried using [A zeros(20,43); zeros(20,43) A] but as I had guessed, the horzcat did't work. I would ideally like to use this block 1000 times to extend the structure of matrix. Could anyone tell me an efficient way to concatenate the small matrix?
Try using kron. This performs what is known as the Kronecker product such that for two matrices A and B, the result is:
In this case, we can replicate what you want exactly by setting A to be the identity matrix of size 1000 x 1000 and B to be the matrix you want to replicate. However, to promote computational savings and memory usage, make sure you use the sparse version of the identity matrix. This will convert the output matrix to sparse form. If you want to replicate this 1000 times, you are creating a 40000 x 43000 matrix and this requires 13.76 GB of memory and you probably don't have enough memory available for this matrix. Since most of the elements are zero, use the sparse version instead:
N = 1000;
B = kron(speye(N), A);

how to generate a large square binary matrix in MATLAB

i need to generate a large square binary sparse matrix in MATLAB (about 100k x 100k). but i get the "out of memory" error.
Can anybody help?
A 100,000 x 100,000 matrix contains 10,000,000,000 doubles. At 8 bytes each, that's 80,000,000,000 bytes, i.e. about 74.5058 Gb.
I seriously doubt you have 80Gb of RAM (let alone, allocated only to matlab), so presumably you'll have to find another way to process your data in chunks.
EDIT Apologies, I only just noticed the sparse bit.
If you try to initialise your sparse matrix as sparse (zeros( 100000,100000)), this will fail for the above reason (i.e. you're asking octave / matlab to first store a 75Gb matrix of zeros, and only then convert it to a sparse matrix).
Instead, you should initialise your 100,000x100,000 sparse matrix like so:
s = sparse(100000,100000);
and then proceed to fill in its contents.
Assuming the number of nonzero elements in your sparse matrix is low enough that they can be handled easily with your system's memory, and that you have a way of filling in the necessary values you have in mind without allocating a big bad matrix first, then this should work fine.
Have a look at the sparse function for other ways of initialising a sparse matrix from data.
Try increasing the size of the swap file of your system.

Sparse Matrix Assignment becomes very slow in Matlab

I am filling a sparse matrix P (230k,290k) with values coming from a text file which I read line by line, here is the (simplified) code
while ...
C = textscan(text_line,'%d','delimiter',',','EmptyValue', 0);
line_number = line_number+1;
P(line_number,:)=C{1};
end
the problem I have is that while at the beginning the
P(line_number,:)=C{1};
statement is fast, after a few thousands lines become exterely slow, I guess because Matlab need to find the memory space to allocate every time. Is there a way to pre-allocate memory with sparse matrixes? I don't think so but maybe I am missing something. Any other advise which can speed up the operation (e.g. having a lot of free RAM can make the difference?)
There's a sixth input argument to sparse that tells the number of nonzero elements in the matrix. That's used by Matlab to preallocate:
S = sparse(i,j,s,m,n,nzmax) uses vectors i, j, and s to generate an
m-by-n sparse matrix such that S(i(k),j(k)) = s(k), with space
allocated for nzmax nonzeros.
So you could initiallize with
P = sparse([],[],[],230e3,290e3,nzmax);
You can make a guess about the number of nonzeros (perhaps checking file size?) and use that as nzmax. If it turns you need more nonzero elements in the end, Matlab will preallocate on the fly (slowly).
By far the fastest way to generate a sparse matrix wihtin matlab is to load all the values in at once, then generate the sparse matrix in one call to sparse. You have to load the data and arrange it into vectors defining the row and column indices and values for each filled cell. You can then call sparse using the S = sparse(i,j,s,m,n) syntax.

out of memory error when using diag function in matlab

I have an array of valued double M where size(M)=15000
I need to convert this array to a diagonal matrix with command diag(M)
but i get the famous error out of memory
I run matlab with option -nojvm to gain memory space
and with the optin 3GB switch on windows
i tried also to convert my array to double precision
but the problem persist
any other idea?
There are much better ways to do whatever you're probably trying to do than generating the full diagonal matrix (which will be extremely sparse).
Multiplying that matrix, which has 225 million elements, by other matrices will also take a very long time.
I suggest you restructure your algorithm to take advantage of the fact that:
diag(M)(a, b) =
M(a) | a == b
0 | a != b
You'll save a huge amount of time and memory and whoever is paying you will be happier.
This is what a diagonal matrix looks like:
Every entry except those along the diagonal of the matrix (the ones where row index equals the column index) is zero. Relating this example to your provided values, diag(M) = A and M(n) = An
Use saprse matrix
M = spdiags( M, 0, numel(M), numel(M) );
For more info see matlab doc on spdiags and on sparse matrices in general.
If you have an n-by-n square matrix, M, you can directly extract the diagonal elements into a row vector via
n = size(M,1); % Or length(M), but this is more general
D = M(1:n+1:end); % 1-by-n vector containing diagonal elements of M
If you have an older version of Matlab, the above may even be faster than using diag (if I recall, diag wasn't always a compiled function). Then, if you need to save memory and only need the diagonal of M and can get rid of the rest, you can do this:
M(:) = 0; % Zero out M
M(1:n+1:end) = D; % Insert diagonal elements back into M
clear D; % Clear D from memory
This should not allocate much more than about (n^2+n)*8 = n*(n+1)*8 bytes at any one time for double precision values (some will needed for indexing operations). There are other ways to do the above that might save a bit more if you need a (full, non-sparse) n-by-n diagonal matrix, but there's no way to get around that you'll need n^2*8 bytes at a minimum just to store the matrix of doubles.
However, you're still likely to run into problems. I'd investigate sparse datatypes as #user2379182 suggests. Or rework you algorithms. Or better yet, look into obtaining 64-bit Matlab and/or a 64-bit OS!