Matlab out of memory error behaves differently for one- and two-dimensional arrays - matlab

Today I have the need to allocate a vector with size 100000 in Matlab. I try to do it simply using:
a=ones(100000);
which my Matlab angrily answered with:
Out of memory. Type HELP MEMORY for your options.
Which is strange since I have 64-bit Matlab running on a 64-bit machine with 8 GB RAM. I tried many of the "resolving out of memory errors in Matlab" recipes on SO and in other places, but no luck so far.
Now I'm more confused when something like:
a=ones(10000,10000);
runs without problems on my machine.
Does this mean that Matlab has some mechanism to limit the number of elements of a vector in a single-dimensional space?

Today I have the need to allocate a vector with size 100000 in Matlab.
Now, as noted in the comments and such, the method you tried (a=ones(100000);) creates a 100000x100000 matrix, which is not what you want.
I would suggest you try:
a = ones(1, 100000);
Since that creates a vector rather than a matrix.
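A quick way to see the difference without risking an out-of-memory error is to try both calls with a small n and compare the sizes. This is only a minimal sketch; the variable names m, v and n are chosen for illustration.

```matlab
n = 5;                      % small stand-in for 100000
m = ones(n);                % single argument: n-by-n matrix
v = ones(1, n);             % two arguments: 1-by-n row vector

disp(size(m));              % dimensions of the matrix
disp(size(v));              % dimensions of the vector
fprintf('%d elements vs %d elements\n', numel(m), numel(v));
```

The element count grows with n^2 in the first case and with n in the second, which is why only the first call runs out of memory for n = 100000.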

Arguments Matter
Calling Matlab's ones(), zeros() or magic() with a single argument n creates a square matrix of size n-by-n:
>> a = ones(5)
a =
     1     1     1     1     1
     1     1     1     1     1
     1     1     1     1     1
     1     1     1     1     1
     1     1     1     1     1
Calling ones() or zeros() with 2 arguments (r, c) instead creates a matrix of size r-by-c (magic() only accepts a single argument):
>> a = ones(2, 5)
a =
     1     1     1     1     1
     1     1     1     1     1
This is all well documented in Matlab's documentation.
Size Matters Too
Doubles
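Having said this, when you do a = zeros(1e6) you are creating a square matrix with 1e6 * 1e6 = 1e12 elements. Since these are doubles, the total allocated size would be 8 * 1e12 bytes, which is circa (8 * 1e12) / 1024^3 = 7450.6GB. Do you have this much RAM on your machine? (The ones(100000) from the question asks for 1e10 doubles, circa (8 * 1e10) / 1024^3 = 74.5GB, which is still far more than 8 GB.)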
Compare this with a = zeros(1, 1e6), which creates a row vector with 1 * 1e6 = 1e6 elements, for a total allocated size of (8 * 1e6) / 1024^2 = 7.63MB.
Logicals
Logical values, on the other hand, are boolean values, which can be set to either 0 or 1, representing False or True. With this in mind, you can allocate matrices of logicals using either false() or true(). Here the same single-argument rule applies, hence a = false(1e6) creates a square matrix with 1e6 * 1e6 = 1e12 elements. Matlab, like many other programming languages, stores each boolean value in a full byte rather than a single bit. Even though there is a clear cost in terms of memory usage, this provides significant performance improvements, because accessing single bits is a slow operation.
The total allocated size of our a = false(1e6) matrix would therefore be 1e12 Bytes which is circa 1e12 / 1024^3 = 931.32GB.
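The figures in this answer can be reproduced without allocating anything, since the size is just the element count times the bytes per element. The following is a rough sketch under that assumption; the variable names are illustrative, and the final check uses a small array that is cheap to create.

```matlab
% Estimated size of an n-by-n array, computed without allocating it.
n = 1e6;
bytesPerDouble  = 8;        % class double
bytesPerLogical = 1;        % class logical

gbDouble  = n^2 * bytesPerDouble  / 1024^3;
gbLogical = n^2 * bytesPerLogical / 1024^3;
fprintf('zeros(%g) needs about %.1f GB\n', n, gbDouble);
fprintf('false(%g) needs about %.2f GB\n', n, gbLogical);

% Sanity check against a small array that can actually be allocated.
small = false(100);
info = whos('small');
assert(info.bytes == 100^2 * bytesPerLogical);
```

Running an estimate like this before calling zeros(), ones() or false() makes it obvious when a request cannot fit in RAM.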

Well, the first declaration tries to build a matrix of 100000x100000 ones. That would be ~9.3 GB.
The second tries to declare a matrix of 10000 x 10000. That would be ~95MB.
I assumed each one is stored in a byte. Since ones() returns doubles (8 bytes each) by default, the requested memory size will be 8 times larger: ~74.5 GB and ~763MB respectively.

Related

efficiently use the memory of GPU in matlab

I am using a GPU for computation in matlab, and I keep getting out-of-memory errors.
So I thought I could convert some of my variables from double, which is the default type in matlab, to single. Then I did the following experiment:
A = gpuArray([1,2,3])
A =
1 2 3
whos A
Name Size Bytes Class
A 1*3 4 gpuArray
B = gpuArray(single([1,2,3]))
B =
1*3 gpuArray single row vector
1 2 3
whos B
Name Size Bytes Class
B 1*3 4 gpuArray
Now I am a little bit confused. On one hand, it does show me that B is a 1*3 gpuArray single row vector. On the other hand, the whos command shows no difference between A and B.
I am wondering if this double-to-single conversion will indeed help me reduce the memory usage of my GPU in matlab. Basically, my question is: when I move 2 variables from the CPU to the GPU, one double and the other single, do they consume the same amount of GPU memory in matlab? The whos command shows no difference.
Note the following:
A = gpuArray([1:1000])
whos A
Name Size Bytes Class Attributes
A 1x1000 4 gpuArray
Interesting! Only 4 bytes!
But this has an easy explanation: whos is only giving you the size of the variable in CPU RAM. It's 4 bytes because it's just a memory address, not the data itself. The data is on the GPU, and it cannot "easily" be accessed by the CPU.
Answering your question: Yes, single will take half of the memory of double on the GPU.
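Since whos cannot show this for a gpuArray, the factor of two is easiest to see on ordinary host arrays, where whos reports the real data size. The sketch below runs on the CPU only and assumes the element sizes carry over to the GPU unchanged. If Parallel Computing Toolbox is available, comparing the AvailableMemory property of gpuDevice before and after creating an array is probably one way to observe the same thing on the device itself.

```matlab
x = [1 2 3];                % double by default: 8 bytes per element
y = single(x);              % same values in single precision: 4 bytes per element

ix = whos('x');
iy = whos('y');
fprintf('double: %d bytes, single: %d bytes\n', ix.bytes, iy.bytes);
```

Converting on the host before the transfer, as in gpuArray(single(...)), means the double copy never has to exist on the GPU.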

What's the maximum length of matrix I can store in Matlab

I am trying to store a matrix of size 4 x 10^6, but Matlab doesn't seem to manage it when I run the code. Either it can't store a matrix of that size, or I should store it another way. The code is below:
matrix = [];
for j = 1 : 10^6
    x = randn(4,1);
    matrix = [matrix x];
end
The problem is that it keeps running for a long time and never finishes; however, when I remove the line matrix = [matrix x];, the loop finishes very quickly. What I need is to have the matrix in a file so that I can use it wherever I need it.
It is determined by your amount of available RAM. If you store double values, like here, you require 64 bits per number. Thus, storing 4M values requires 4*10^6*64 = 256M bits, which in turn is 32MB RAM.
A = rand(4,1e6);
whos A
Name Size Bytes Class Attributes
A 4x1000000 32000000 double
Thus you can store this unless you have less than 32MB of RAM free.
The reason your code takes so long is that you grow your matrix inside the loop: every concatenation allocates a new, larger array and copies the old contents into it. The orange wiggles on the line matrix = [matrix x]; are not because the festive season is almost here, but because it is very bad practice to do this. As the warning tells you: preallocate your matrix. You know how large it will be, so just initialise it as matrix = zeros(4,1e6); and fill column j with matrix(:,j) = x; instead of growing it.
Of course in this case you can simply do matrix = randn(4,1e6), which is even faster than looping.
For more information about preallocation see the official MATLAB documentation, this question (which I answered), or this one.
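As a minimal sketch of the preallocated version of the loop from the question (same variable name matrix, with n introduced here only for readability):

```matlab
n = 1e6;
matrix = zeros(4, n);             % allocate the full 4-by-n array once
for j = 1:n
    matrix(:, j) = randn(4, 1);   % overwrite column j, no reallocation
end
```

Because the array never changes size, each iteration only writes four numbers instead of copying everything stored so far.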

A memory-efficient replacement for meshgrid

Assume a simple example where I have indices
index_pos = [3,4,5];
index_neg = [1,2];
I would like to have a matrix:
result =
1 3
2 3
1 4
2 4
1 5
2 5
For this purpose I write the following code:
[X,Y] = meshgrid(index_pos,index_neg);
result = [Y(:) X(:)];
I think this is not a very efficient way. Also, this uses too much of my memory when I use big instances. I get the following error:
Error using repmat
Out of memory. Type "help memory" for your options.
Error in meshgrid (line 58)
xx = repmat(xrow,size(ycol));
Error in FME_funct (line 36)
[X,Y] = meshgrid(index_pos,index_neg);
Is there any 'clever' way to generate this matrix using less memory?
PS: I noticed that what I do is also given here. Most probably I have found this idea from there.
This depends entirely on how big your two variables are in relation to the amount of memory in your computer (plus the types of numbers you're using).
Try this:
res = zeros(numel(index_neg)*numel(index_pos), 2)
If that gives you an out-of-memory error then you don't have enough memory in your computer to store the result, regardless of the efficiency of the generator, so if the above errors, then you're stuck. If it does not error, then you could well write a looping algorithm that uses less temporary memory.
That said, by default MATLAB represents numbers with double precision, 8 bytes per number. If your index_ variables happen to contain, say, only positive integers (all less than 65,536) then you could use 16-bit unsigned integers. These are just 2 bytes per number and so take up 4 times less space than doubles. You can test this with:
res = zeros(numel(index_neg)*numel(index_pos), 2, 'uint16')
Finally you can find out how much memory is available to MATLAB with the memory command.
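One way to combine the two suggestions above is to preallocate the result as uint16 and fill its two columns directly, so that the full-size X and Y grids from meshgrid are never created. This is a sketch under the assumption that all indices are positive integers below 65,536; nNeg and nPos are illustrative names, and each repmat call still creates one temporary column.

```matlab
index_pos = uint16([3, 4, 5]);
index_neg = uint16([1, 2]);
nNeg = numel(index_neg);
nPos = numel(index_pos);

res = zeros(nNeg * nPos, 2, 'uint16');          % 2 bytes per element
% First column: index_neg repeated as a whole, once per element of index_pos.
res(:, 1) = repmat(index_neg(:), nPos, 1);
% Second column: each element of index_pos repeated nNeg times in a row.
res(:, 2) = reshape(repmat(index_pos(:).', nNeg, 1), [], 1);
disp(res);
```

For the small example this gives the same rows as the meshgrid version in the question, at a quarter of the memory of a double result.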
Here is a faster way to generate such a matrix. It avoids explicit temporary arrays by building the matrix directly in place:
res2 = [ reshape( bsxfun( @times , index_neg.' , ones(size(index_pos)) ) , [] , 1 ) , ...
         reshape( bsxfun( @times , index_pos , ones(size(index_neg)).' ) , [] , 1 ) ] ;
Note that this requires the same amount of memory to hold the main array, so it will not be possible to generate arrays larger than with your method (which fails at the meshgrid stage). This maximum size is ultimately dictated by the amount of RAM available to your system.

Matlab's sparse function explanation

This is my first time posting anything, so please be nice!
I'm studying some code for a random walker algorithm and I got lost with the use of sparse to build the sparse Laplacian matrix of a point and edge set. I'm planning to write my own version of the sparse function, but I'm having problems understanding how it works and what its output is, so any help would be perfect.
Thank you all !
A sparse matrix is a special type of "matrix" in matlab, which is conceptually equivalent to a normal matrix, but works differently 'under the hood'.
They are called "sparse", because they are usually used in situations where one would expect most elements of the matrix to contain zeros, and only a few non-zero elements.
The advantage of using this type of special object is that the memory it takes to create such an object depends primarily on the number of nonzero elements contained, rather than the size of the "actual" matrix.
By contrast, a normal (full) matrix needs memory allocated relative to its size. So for instance, a 1000x1000 matrix of numbers (so called 'doubles') will take roughly 8 MB to store (1 million elements at 8 bytes per 'double'), even if all the elements are zero. Observe:
>> a = zeros(1000,1000);
>> b = sparse(1000,1000);
>> whos
Name Size Bytes Class Attributes
a 1000x1000 8000000 double
b 1000x1000 8024 double sparse
Now, assign a value to each of them at subscripts (1,1) and see what happens:
>> a(1,1) = 1 % without a semicolon, this will flood your screen with zeros
>> b(1,1) = 1
b =
(1,1) 1
As you can see, the sparse matrix only keeps track of nonzero values, and the zeros are 'implied'.
Now let's add some more elements:
>> a(1:100,1:100) = 1;
>> b(1:100,1:100) = 1;
>> whos
Name Size Bytes Class Attributes
a 1000x1000 8000000 double
b 1000x1000 168008 double sparse
As you can see, the allocated memory for a hasn't changed, because the size of the overall array hasn't changed. Whereas for b, because it now contains more nonzero values, it takes up more space in memory.
In general most sparse matrices should work with the same operations as normal matrices; the reason for this is that most 'normal' functions are explicitly defined to also accept sparse matrices, but treat them differently under the hood (i.e. they try to arrive at the same result, but using a different approach internally to do so, one that is more suitable to sparse matrices). e.g.:
>> c = sum(a(:))
c =
10000
>> d = sum(b(:))
d =
(1,1) 10000
You can 'convert' a full matrix directly to a sparse one with the sparse command, and a sparse matrix back to a "full" matrix with the full command:
>> sparse(c)
ans =
(1,1) 10000
>> full(d)
ans =
10000
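For the Laplacian mentioned in the question, sparse is usually called with three index/value vectors, sparse(i, j, v, m, n), which places v(k) at position (i(k), j(k)) and sums values that land on the same position. The sketch below builds a graph Laplacian that way; the edge list is a made-up 4-node chain, and L = D - A (degree matrix minus adjacency matrix) is the standard unweighted definition, which may differ from the weighted one used in the random walker code.

```matlab
edges = [1 2; 2 3; 3 4];              % hypothetical edge list, one edge per row
n = 4;                                % number of points

% Symmetric adjacency matrix: add each edge in both directions.
i = [edges(:, 1); edges(:, 2)];
j = [edges(:, 2); edges(:, 1)];
A = sparse(i, j, ones(size(i)), n, n);

% Degree matrix: row sums of A placed on the diagonal.
deg = full(sum(A, 2));
D = sparse((1:n).', (1:n).', deg, n, n);

L = D - A;                            % still sparse
disp(full(L));
```

Only the nonzero entries are stored at every step, so the same construction scales to point sets far too large for a full n-by-n matrix.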

SOM out of memory in MATLAB

I am trying to use SOM to learn 80000X10 samples (each sample is a vector of size 10). But I can't even configure 8x8 net with 10000X1 samples. It throws "out of memory" error.
Here is my code (data is 80000X10 matrix):
net=selforgmap([8 8])
net=configure(net,data(1:10000,1))
Matlab help: "Unconfigured networks are automatically configured and initialized the first time train is called."
Even for 8000X1 dataset, it takes a lot of time. I noticed a huge numWeightElements: 512000 in net variable (8*8*8000=512000). The weights should be 8*8. SOM training algorithm shouldn't use this much memory. What is wrong?
The output of memory command:
>> memory
Maximum possible array: 3014 MB (3.160e+009 bytes)
Memory available for all arrays: 3014 MB (3.160e+009 bytes)
Memory used by MATLAB: 1154 MB (1.210e+009 bytes)
Physical Memory (RAM): 4040 MB (4.236e+009 bytes)
I think you're configuring the input structure incorrectly. Each input vector must be a column and not a row. Quote from "Clustering Data - MATLAB & Simulink":
To define a clustering problem, simply arrange Q input vectors to be clustered as columns in an input matrix (see "Data Structures" for a detailed description of data formatting for static and time series data). For instance, you might want to cluster this set of 10 two-element vectors:
inputs = [7 0 6 2 6 5 6 1 0 1; 6 2 5 0 7 5 5 1 2 2]
As you can see, each input vector is a column. You have 10 two-element input vectors as a 2x10 array.
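Applied to the question, that means transposing the data so that each of the 10-element samples becomes a column. The following is only a sketch: data is replaced by random placeholder values, inputs is an illustrative name, and the network calls are skipped when selforgmap is not on the path.

```matlab
data = rand(80000, 10);               % placeholder for the real 80000-by-10 data
inputs = data(1:10000, :).';          % 10-by-10000: one sample per column

if exist('selforgmap', 'file')        % only when the toolbox is installed
    net = selforgmap([8 8]);
    net = configure(net, inputs);
    disp(net.numWeightElements);      % weight count after configuring
end
```

With 10 rows the network should be sized for 10-element inputs, rather than for thousands of elements as in the 8*8*8000 case from the question.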