Alternate method to kron

I'm doing CDMA spreading in MATLAB. And I'm having an Out of Memory error in MATLAB despite upgrading my RAM, preallocating arrays, etc.
Is there an alternate method to kron (Kronecker tensor product) in MATLAB? Here is my code:
tempData = kron( Data, walsh);
Data is a M by 1 matrix and walsh (spread code) is a 8 by 1 matrix.
My Data comprises real and imaginary parts, e.g.: 0.000 + 1.000i or 1.000 + 0.000i in double format.

This call to kron is not memory intensive. I know, your problem seems so trivial. However, you don't tell us what M is. For very large values of M, you are simply trying to create an array that is too large to fit in memory. It is very easy to forget that your computer is not infinitely large or infinitely fast. We get spoiled when we see "giga" in front of everything.
If you absolutely must do this for that value of M, then you probably need the 64-bit version of MATLAB, and more memory will always help once you do that.
Another option is to make Data single precision, if you can afford the loss in precision. This will at least give you an extra factor of 2. In order to provide the best help, we need to know the size of M.
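If the goal is only to avoid the call to kron itself, the Kronecker product of two column vectors can be written as an outer product followed by a reshape. The sketch below assumes Data is M-by-1 and walsh is 8-by-1, as in the question; the value of M, the test data and the spreading code are placeholders. It does not shrink the result, which still holds 8*M complex doubles at 16 bytes each.

```matlab
% Stand-in data; M and the +/-1 spreading code are illustrative assumptions.
M = 1000;
Data  = complex(round(rand(M,1)), round(rand(M,1)));
walsh = [1; -1; 1; -1; 1; -1; 1; -1];

% kron(Data, walsh) stacks Data(i)*walsh for i = 1..M.
% walsh*Data.' holds those same blocks as columns, so reshape yields the same vector.
% The non-conjugating transpose .' matters because Data is complex.
tempData = reshape(walsh * Data.', [], 1);

% Rough size of the result in bytes (complex double = 16 bytes per element).
bytesNeeded = 16 * numel(walsh) * M;

% Single precision halves the footprint, at the cost of precision.
tempSingle = reshape(single(walsh) * single(Data).', [], 1);
```

Evaluating bytesNeeded for the real M before allocating anything is a quick way to see whether the result can fit at all.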

Related

repmat vs simple matrix multiplication in MATLAB

Let v be a row vector of length n. The goal is to create a matrix A with m rows that are all equal to v.
MATLAB has a function for this that is called repmat. Possible code would be
A = repmat(v,[m 1])
There is an alternative equally concise way using simple matrix operations
A = ones(m,1)*v
Is any of the two methods preferable for large m and n?

Let's compare them!
When testing algorithms, two metrics are important: time and memory.
Let's start with time:
Clearly repmat wins!
Memory:
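profile -memory on
v = rand(1,100);   % example row vector; its length is an assumption, not given in the original
ii = 0;            % index into the timing arrays
for m=1000:1000:50000
    f1=@()(repmat(v,[m 1]));
    f2=@()(ones(m,1)*v);
    ii=ii+1;
    t1(ii)=timeit(f1);   % time for repmat
    t2(ii)=timeit(f2);   % time for the outer product
end
profreport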
It seems that both take the same amount of memory. However, the profiler is known for not showing all the memory, so we cannot fully trust it.
Still, it is clear that repmat is better.

You should use repmat().
Matrix multiplication is an O(n^3) operation in general, which is much slower than replicating data in memory.
On top of that, the second option allocates more data in memory, of the size of the output.
In the case above one operand is a vector, so the outer product is cheaper than a general matrix multiplication, yet it is still not as fast as a pure memory operation.
MATLAB doesn't exploit the knowledge that all elements of the vector are 1, hence each element of v is multiplied by 1, m times.
Both operations will be mainly memory bound, yet the more efficient, fast and direct method is repmat().
The question is, what you do afterwards?
Because you may not need repmat().
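To illustrate that last point: assuming the replicated matrix is only needed for an elementwise operation with another m-by-n matrix, bsxfun (or implicit expansion in R2016b and later) applies v to every row without materialising the copies. The matrix B and the sizes below are made up for the example.

```matlab
m = 2000; n = 50;          % illustrative sizes
v = rand(1, n);
B = rand(m, n);            % hypothetical matrix that the replicated v would be combined with

% With an explicit copy of v in every row:
C1 = B + repmat(v, [m 1]);

% Without the copy: bsxfun expands v along the first dimension on the fly.
C2 = bsxfun(@plus, B, v);
```

Both give the same result; the second never builds the m-by-n replica of v.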

how to generate a large square binary matrix in MATLAB

I need to generate a large square binary sparse matrix in MATLAB (about 100k x 100k), but I get the "out of memory" error.
Can anybody help?

A 100,000 x 100,000 matrix contains 10,000,000,000 doubles. At 8 bytes each, that's 80,000,000,000 bytes, i.e. 80 GB, or about 74.5 GiB.
I seriously doubt you have 80 GB of RAM (let alone allocated only to MATLAB), so presumably you'll have to find another way to process your data in chunks.
EDIT Apologies, I only just noticed the sparse bit.
If you try to initialise your sparse matrix as sparse(zeros(100000,100000)), this will fail for the above reason (i.e. you're asking Octave / MATLAB to first store a 75 GiB matrix of zeros, and only then convert it to a sparse matrix).
Instead, you should initialise your 100,000x100,000 sparse matrix like so:
s = sparse(100000,100000);
and then proceed to fill in its contents.
Assuming the number of nonzero elements in your sparse matrix is low enough that they can be handled easily with your system's memory, and that you have a way of filling in the necessary values you have in mind without allocating a big bad matrix first, then this should work fine.
Have a look at the sparse function for other ways of initialising a sparse matrix from data.
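As a sketch of initialising from data, the triplet form sparse(i,j,v,m,n) builds the matrix directly from the positions of its nonzeros, so no dense 100,000 x 100,000 array is ever created. The number of nonzeros and their random positions are assumptions made for the example.

```matlab
n  = 100000;                % matrix is n-by-n
nz = 500000;                % number of candidate nonzeros (assumed for the example)

% Random positions of the ones; drop repeated (row, column) pairs up front.
ij = unique([randi(n, nz, 1), randi(n, nz, 1)], 'rows');

% Logical values keep the matrix binary and cost less than doubles per nonzero.
S = sparse(ij(:,1), ij(:,2), true, n, n);
```

Building all the triplets first and calling sparse once is generally much cheaper than assigning elements one at a time into an existing sparse matrix.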

Try increasing the size of the swap file of your system.

Matlab: 'Block' matrix multiplication without resorting to repmat

I have a matrix A (of dimensions mxn) and vector b (of dimensions nx1).
I would like to construct a vector which is repmat((A*b),[C 1]), where C = n/m. I am using a lot of data and therefore n~100000 and C~10.
As you can see this is really block matrix multiplication without having to explicitly create the full A block matrix (dimensions nxn) as this easily exceeds available memory.
A is sparse and has already been converted using the function sparse().
Is there a better way of doing this? (Considering the speed and memory footprint trade-off, I'd rather have a smaller memory footprint.)
Usually if I was doing elementwise calculations, I would use bsxfun instead of using repmat to minimise memory footprint. As far as I know there is no equivalent bsxfun for matrix multiplication?

It looks like it is time for you to REALLY learn to use sparse matrices instead of wondering how to do this otherwise.
SPARSE block diagonal matrices do NOT take up a lot of memory if you create them correctly. I'll use my blktridiag function, which actually creates block tridiagonal matrices. Here I've used it to create random block diagonal matrices. I've set the off-diagonal elements to zero, so it really is block diagonal.
A = rand(3,3,100000);
tic,M = blktridiag(A,zeros(3,3,99999),zeros(3,3,99999));toc
Elapsed time is 0.478068 seconds.
And, while it is not small, the memory required is not that much more than twice the memory required to store the diagonal elements themselves.
whos A M
  Name      Size               Bytes       Class     Attributes
  A         3x3x100000         7200000     double
  M         300000x300000      16800008    double    sparse
Here about 17 megabytes, compared to 7 megabytes.
Note that blktridiag explicitly creates a sparse matrix directly.
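blktridiag is a File Exchange function rather than part of MATLAB. For the purely block diagonal case, the same idea can be sketched with built-in functions only, by computing the row and column index of every block element and calling sparse once. This is an illustration under that assumption, not the implementation used by blktridiag, and the block count is reduced to keep the example quick.

```matlab
n = 3; k = 1000;                 % block size and number of blocks (illustrative)
A = rand(n, n, k);               % A(:,:,b) is the b-th diagonal block

% Local row r, local column c and block number b of every element of A.
[r, c, b] = ndgrid(1:n, 1:n, 1:k);

% Shift the local indices by the offset of block b on the diagonal.
rows = r + (b - 1) * n;
cols = c + (b - 1) * n;

% Assemble the block diagonal matrix directly in sparse form.
M = sparse(rows(:), cols(:), A(:), n*k, n*k);
```

Only the n*n*k block entries are stored; the off-diagonal zeros never exist in memory.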

Matlab large matrix multiplication limit

I'm trying to run the following code below, but Matlab freezes when my matrix gets beyond 10,000 columns. What can I do to fix this?
X = load('iris.mtx');
[n,d] = size(X);
%X=14000x128 double
%form RBF over the data:
nms = sum(X'.^2); %nms becomes 1x14000 (row vector of squared row norms of X)
%here is where the crash begins, for a smaller data size, like 10000x128, this part won't freeze
K = exp(-nms'*ones(1,n) -ones(n,1)*nms + 2*X*X');
Is this a limitation that I just have to accept? I need to be able to use this for matrices that are much bigger than what I'm currently using.

I would refer to this previously asked question about limitations in the size of Matlab matrices: Matrix size limitation in MATLAB.
The only limitations are the limitations of your hardware.
Not knowing more about what you need to do, I can't suggest much further reading. However, once your matrices get larger than the memory you have, this question addresses the issue of optimizing your operations: Efficient multiplication of very large matrices in MATLAB.
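As a rough illustration of where the memory goes: for n = 14000, every n-by-n double array takes 14000^2 * 8 bytes, about 1.57 GB, and the expression in the question builds several of them (the two products with ones, X*X', and the intermediate sums). A minimal sketch that avoids the products with ones, assuming the n-by-n result itself still fits in memory, is shown below; the random X is a small stand-in for the real data.

```matlab
X = rand(500, 8);                 % small stand-in for the 14000x128 data
n = size(X, 1);

nms = sum(X.^2, 2);               % n-by-1 column of squared row norms

K = 2 * (X * X');                 % the only explicit n-by-n product
K = bsxfun(@minus, K, nms);       % subtract nms(i) from every entry of row i
K = bsxfun(@minus, K, nms');      % subtract nms(j) from every entry of column j
K = exp(K);                       % K(i,j) = exp(-||X(i,:) - X(j,:)||^2)

bytesPerMatrix = 8 * 14000^2;     % size of one 14000x14000 double array, in bytes
```

If even one n-by-n array is too large, the same steps can be applied to a block of rows of X at a time and each block of K processed or saved separately.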

Solving multiple linear systems using vectorization

Sorry if this is obvious but I searched a while and did not find anything (or missed it).
I'm trying to solve linear systems of the form Ax=B with A a 4x4 matrix, and B a 4x1 vector.
I know that for a single system I can use mldivide to obtain x: x=A\B.
However I am trying to solve a great number of systems (possibly > 10000) and I am reluctant to use a for loop because I was told it is notably slower than matrix formulation in many MATLAB problems.
My question is then: is there a way to solve Ax=B using vectorization with A a 4x4xN array and B a 4xN matrix?
PS: I do not know if it is important but the B vector is the same for all the systems.

You should use a for loop. There might be a benefit in precomputing a factorization and reusing it, if A stays the same and B changes. But for your problem where A changes and B stays the same, there's no alternative to solving N linear systems.
You shouldn't worry too much about the performance cost of loops either: the MATLAB JIT compiler means that loops can often be just as fast on recent versions of MATLAB.
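For reference, the opposite case mentioned above (A fixed, many right-hand sides) needs no loop at all. The sketch below is not the asker's situation; it only illustrates what reusing a factorization looks like, with a made-up well-conditioned A.

```matlab
A = rand(4) + 4*eye(4);        % one fixed, well-conditioned example matrix
B = rand(4, 10000);            % each column is a separate right-hand side

% Option 1: mldivide accepts a matrix of right-hand sides directly.
X1 = A \ B;

% Option 2: factor once, then reuse the factors for right-hand sides that arrive later.
[L, U, p] = lu(A, 'vector');   % row permutation as a vector: A(p,:) = L*U
X2 = U \ (L \ B(p, :));
```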

I don't think you can optimize this further. As explained by @Tom, since A is the one changing, there is no benefit in factoring the various A's beforehand...
Besides the looped solution is pretty fast given the dimensions you mention:
A = rand(4,4,10000);
B = rand(4,1); %# same for all linear systems
tic
X = zeros(4,size(A,3));
for i=1:size(A,3)
X(:,i) = A(:,:,i)\B;
end
toc
Elapsed time is 0.168101 seconds.

Here's the problem:
You're trying to perform a 2D operation (mldivide) on a 3D matrix. No matter how you look at it, you need to reference the matrix by index, which is where the time penalty kicks in. It's not the for loop that is the problem, it's how people use them.
If you can structure your problem differently, then perhaps you can find a better option, but right now you have a few options:
1 - mex
2 - parallel processing (write a parfor loop)
3 - CUDA
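Option 2 could look like the sketch below. It assumes the Parallel Computing Toolbox is available; without it MATLAB runs the loop serially. For systems as small as 4x4 the work per iteration is tiny, so the parallel overhead may well outweigh the gain, and it is worth timing against the plain for loop.

```matlab
A = rand(4, 4, 10000);
B = rand(4, 1);              % same right-hand side for all systems
k = size(A, 3);
X = zeros(4, k);

parfor i = 1:k
    % A is sliced by i, X is a sliced output, B is broadcast to every worker.
    X(:, i) = A(:, :, i) \ B;
end
```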

Here's a rather esoteric solution that takes advantage of MATLAB's peculiar optimizations. Construct an enormous 4k x 4k sparse matrix with your 4x4 blocks down the diagonal. Then solve all simultaneously.
On my machine this gets the same solution up to single precision accuracy as @Amro/Tom's for-loop solution, but faster.
n = size(A,1);
k = size(A,3);
AS = reshape(permute(A,[1 3 2]),n*k,n);
S = sparse( ...
repmat(1:n*k,n,1)', ...
bsxfun(@plus,reshape(repmat(1:n:n*k,n,1),[],1),0:n-1), ...
AS, ...
n*k,n*k);
X = reshape(S\repmat(B,k,1),n,k);
for a random example:
For k = 10000
For loop: 0.122570 seconds.
Giant sparse system: 0.032287 seconds.
If you know that your 4x4 matrices are positive definite then you can use chol on S to improve the accuracy.
This is silly. But so is how slow matlab's for loops still are in 2015, even with JIT. This solution seems to find a sweet spot when k is not too large so everything still fits into memory.

I know this post is years old now, but I'll contribute my two cents anyway. You CAN put all of your A matrices into a bigger block diagonal matrix, where there will be 4x4 blocks on the diagonal of a big matrix. The right hand side will be all of your b vectors stacked on top of each other over and over. Once you set this up, it is represented as a sparse system, and can be efficiently solved with the algorithms mldivide chooses. The blocks are numerically decoupled, so even if there are singular blocks in there, the answers for the nonsingular blocks should be right when you use mldivide. There is a code that took this approach on MATLAB Central:
http://www.mathworks.com/matlabcentral/fileexchange/24260-multiple-same-size-linear-solver
I suggest experimenting to see if the approach is any faster than looping. I suspect it can be more efficient, especially for large numbers of small systems. In particular, if there are nice formulas for the coefficients of A across the N matrices, you can build the full left hand side using MATLAB vector operations (without looping), which could give you additional cost savings. As others have noted, vectorized operations aren't always faster, but they often are in my experience.