Error using zeros Out of memory - matlab

When I try running
Adj = zeros(x*y);
I am receiving the following error:
Error using zeros
Out of memory. Type HELP MEMORY for your options.
where x*y=37901. The occupancy of my PC storage is
I know the C drive doesn't have much space but 34.2 GB should be more than enough for creating a 37901*37901 matrix.
When I run the memory command this is what I got:
>> memory
Maximum possible array: 4825 MB (5.059e+09 bytes) *
Memory available for all arrays: 4825 MB (5.059e+09 bytes) *
Memory used by MATLAB: 12369 MB (1.297e+10 bytes)
Physical Memory (RAM): 12218 MB (1.281e+10 bytes)
* Limited by System Memory (physical + swap file) available.
How can I solve this issue? (I am using MATLAB 2017b)

Actually, coding side, variables are normally stored into memory (your computer RAM) rather than into hard disk space. That's what your error complains about... you don't have enough memory to store the variable you want to allocate.
The default numerical variable used by Matlab is double, which is used to represent double precision floating-point values and takes up 8 bytes of memory. Hence, you are trying to allocate:
37901 * 37901 * 8 = 11491886408 bytes
~= 10.7 gigabytes
When you only have something like 11.9 gigabytes of available memory and Matlab is telling you that you can't allocate an array greater than 4.7 gigabytes. As a workaround, I suggest you to take a look at Tall Arrays, which are a Matlab feature tailored around handling very big data flows:
Tall arrays are used to work with out-of-memory data that is backed by
a datastore. Datastores enable you to work with large data sets in
small chunks that individually fit in memory, instead of loading the
entire data set into memory at once. Tall arrays extend this
capability to enable you to work with out-of-memory data using common
functions.
What is a Tall Array?
Since the data is not loaded into memory all at once, tall arrays can be arbitrarily large in the first dimension
(that is, they can have any number of rows). Instead of writing
special code that takes into account the huge size of the data, such
as with techniques like MapReduce, tall arrays let you work with large
data sets in an intuitive manner that is similar to the way you would
work with in-memory MATLAB® arrays. Many core operators and functions
work the same with tall arrays as they do with in-memory arrays.
MATLAB works with small chunks of the data at a time, handling all of
the data chunking and processing in the background, so that common
expressions, such as A+B, work with big data sets.
Benefits of Tall Arrays
Unlike in-memory arrays, tall arrays typically remain unevaluated until you request that the calculations
be performed using the gather function. This deferred evaluation
allows you to work quickly with large data sets. When you eventually
request output using gather, MATLAB combines the queued calculations
where possible and takes the minimum number of passes through the
data. The number of passes through the data greatly affects execution
time, so it is recommended that you request output only when
necessary.

Related

How should I store data in a recommendation engine?

I am developing a recommendation engine. I think I can’t keep the whole similarity matrix in memory.
I calculated similarities of 10,000 items and it is over 40 million float numbers. I stored them in a binary file and it becomes 160 MB.
Wow!
The problem is that I could have nearly 200,000 items.
Even if I cluster them into several groups and created similarity matrix for each group, then I still have to load them into memory at some point.
But it will consume a lot memory.
So, is there anyway to deal with these data?
How should I stored them and load into the memory while ensuring my engine respond reasonably fast to an input?
You could use memory mapping to access your data. This way you can view your data on disk as one big memory area (and access it just as you would access memory) with the difference that only pages where you read or write data are (temporary) loaded in memory.
If you can group the data somewhat, only smaller portions would have to be read in memory while accessing the data.
As for the floats, if you could do with less resolution and store the values in say 16 bit integers, that would also half the size.

loading large csv files in Matlab

I had csv files of size 6GB and I tried using the import function on Matlab to load them but it failed due to memory issue. Is there a way to reduce the size of the files?
I think the no. of columns are causing the problem. I have a 133076 rows by 2329 columns. I had another file which is of the same no. of rows but only 12 rows and Matlab could handle that. However, once the columns increases, the files got really big.
Ulitmately, if I can read the data column wise so that I can have 2329 column vector of 133076, that will be great.
I am using Matlab 2014a
Numeric data are by default stored by Matlab in double precision format, which takes up 8 bytes per number. Data of size 133076 x 2329 therefore take up 2.3 GiB in memory. Do you have that much free memory? If not, reducing the file size won't help.
If the problem is not that the data themselves don't fit into memory, but is really about the process of reading such a large csv-file, then maybe using the syntax
M = csvread(filename,R1,C1,[R1 C1 R2 C2])
might help, which allows you to only read part of the data at one time. Read the data in chunks and assemble them in a (preallocated!) array.
If you do not have enough memory, another possibility is to read chunkwise and then convert each chunk to single precision before storing it. This reduces memory consumption by a factor of two.
And finally, if you don't process the data all at once, but can implement your algorithm such that it uses only a few rows or columns at a time, that same syntax may help you to avoid having all the data in memory at the same time.

Largest Matrix Matlab Linprog can Support

I want to use MATLAB linprog to solve a problem, and I check it by a much smaller, much simpler example.
But I wonder if MATLAB can support my real problem, there may be a 300*300*300*300 matrix...
Maybe I should give the exact problem. There is a directed graph of network nodes, and I want to get the lowest utilization of the edge capacity under some constraints. Let m be the number of edges, and n be the number of nodes. There are mn² variables and nm² constraints. Unfortunately, n may reach 300...
I want to use MATLAB linprog to solve it. As described above, I am afraid MATLAB can not support it...Lastly the matrix must be sparse, can some way simplify it?
First: a 300*300*300*300 array is not called a matrix, but a tensor (or simply array). Therefore you can not use matrix/vector algebra on it, because that is not defined for arrays with dimensionality greater than 2, and you can certainly not use it in linprog without some kind of interpretation step.
Second: if I interpret that 300⁴ to represent the number of elements in the matrix (and not the size), it really depends if MATLAB (or any other software) can support that.
As already answered by ben, if your matrix is full, then the answer is likely to be no. 300^4 doubles would consume almost 65GB of memory, so it's quite unlikely that any software package is going to be capable of handling that all from memory (unless you actually have > 65 GB of RAM). You could use a blockproc-type scheme, where you only load parts of the matrix in memory and leave the rest on harddisk, but that is insanely slow. Moreover, if you have matrices that huge, it's entirely possible you're overlooking some ways in which your problem can be simplified.
If you matrix is sparse (i.e., contains lots of zeros), then maybe. Have a look at MATLAB's sparse command.
So, what exactly is your problem? Where does that enormous matrix come from? Perhaps I or someone else sees a way in which to reduce that matrix to something more manageable.
On my system, with 24GByte RAM installed, running Matlab R2013a, memory gives me:
Maximum possible array: 44031 MB (4.617e+10 bytes) *
Memory available for all arrays: 44031 MB (4.617e+10 bytes) *
Memory used by MATLAB: 1029 MB (1.079e+09 bytes)
Physical Memory (RAM): 24574 MB (2.577e+10 bytes)
* Limited by System Memory (physical + swap file) available.
On a 64-bit version of Matlab, if you have enough RAM, it should be possible to at least create a full matrix as big as the one you suggest, but whether linprog can do anything useful with it in a realistic time is another question entirely.
As well as investigating the use of sparse matrices, you might consider working in single precision: that halves your memory usage for a start.
well you could simply try: X=zeros( 300*300*300*300 )
on my system it gives me a very clear statement:
>> X=zeros( 300*300*300*300 )
Error using zeros
Maximum variable size allowed by the program is exceeded.
since zeros is a build in function, which only fills a array of the given size with zeros you can asume that handling such a array will not be possible
you can also use the memory command
>> memory
Maximum possible array: 21549 MB (2.260e+10 bytes) *
Memory available for all arrays: 21549 MB (2.260e+10 bytes) *
Memory used by MATLAB: 685 MB (7.180e+08 bytes)
Physical Memory (RAM): 12279 MB (1.288e+10 bytes)
* Limited by System Memory (physical + swap file) available.
>> 2.278e+10 /8
%max bytes avail for arrays divided by 8 bytes for double-precision real values
ans =
2.8475e+09
>> 300*300*300*300
ans =
8.1000e+09
which means I dont even have the memory to store such a array.
while this may not answer your question directly it might still give you some insight.

Does MATLAB execute basic array operations in constant space?

I am getting an out of memory error on this line of MATLAB code:
result = (A(1:xmax,1:ymax,1:zmax) .* B(2:xmax+1,2:ymax+1,2:zmax+1) +
A(2:xmax+1,2:ymax+1,2:zmax+1) .* B(1:xmax,1:ymax,1:zmax)) ./ C
where C is another array. This is on 32 bit MATLAB (I can't seem to get the 64 bit version at the moment, which would temporarily fix my problems).
The arrays result, A, B, and C are pre-initialized and never change size. It is then my guess that this computation is not being performed in constant space.
Is this correct? Is there a way to make it run or check if it is running in constant space?
These arrays of are approximate size (250, 250, 250).
If MATLAB does not run this in constant size, does anyone have any experience as to whether Octave or Julia or (insert similar language) does?
edit 1:
I eliminated excess arrays. There are 10 arrays that are 258 x 258 x 338, which corresponds to 1.67 GB. There are a bunch of other variables but they are much smaller. The calculation presented is simplified, the form of the calculation is:
R = (A(3Drange) .* B(3Drange) + A(new_3Drange) .* D(new_3Drange) + . . . ) ./ C
where the ranges generally just differ by a shift of plus or minus 1 or 2.
The output of memory command:
Maximum possible array: 669 MB (7.013e+08 bytes) *
Memory available for all arrays: 1541 MB (1.616e+09 bytes) **
Memory used by MATLAB: 2209 MB (2.316e+09 bytes)
Physical Memory (RAM): 8154 MB (8.550e+09 bytes)
* Limited by contiguous virtual address space available.
** Limited by virtual address space available.
Apparently I should be violating the second line. However, the code runs fine until the first operation that I actually do with the arrays. Perhaps MATLAB is being lazy and not allocating when I type:
A=zeros(xmax+2,ymax+2,zmax+2);
but still telling me in the workspace that the variable is allocated.
This code has worked before with smaller arrays. (edit: but it seems the actual memory size is the problem, not the size of each individual array).
The very curious thing to me is why it does not error during allocation, but instead errors during the first calculation.
edit 2:
I have confirmed that the loop is not running constant in space. There is about a .8 GB of memory being allocated during the calculation. Here is an image of resource usage while the command is being executed in a loop:
However, I tried breaking up the computation into multiple lines. I split the computation at each addition and added on each part in a new command, treating R as a accumulator. The result is that less memory is allocated at one time, but presumably more often. Here is the picture:
I am still curious as to why MATLAB doesn't want to execute this in constant space. I think it perhaps has something to do with the indexing being shifted - I am planning on investigating it more later and then putting this all together in an answer, but someone may beat me to it, which would be great also. Now, though, I can run the array size I was looking for and can finish my project.
I guess that most of the question has already been answered:
Does it operate in constant space?
No as you verified, it does not.
Why doesn't it operate in constant space?
Matlab claims to be fast at vectorized matrix operations, not so much emphasis is placed on memory efficiency.
What to do now?
Here are different options, the first one is preferred if possible, the other two are certainly possible.
Make it fit, for example by upgrading to 64 bit matlab or by not putting other stuf in your workspace
Work on parts of the matrix, so for example cut it in half
Dont use vectorization at all but make a simple for loop
If you don't vectorize, you will have a minimal space solution.

Maximum array size in objective C on iPhone?

I have a VERY large array (96,000 elements of type GLfloat). It was previously 24,000 elements, until I made a couple of changes. Now I'm getting a crash. I haven't done much to debug it yet, but when I noticed how ridiculously large one of my arrays was getting I thought it might be worth looking into. So, my only question is whether 96,000 elements (or 384,000 bytes) is too much for a single array?
That should be fine on the heap, but you should avoid allocations of that size on the stack. So malloc/free or new[]/delete[] is what you should use to create and destroy an array of that size.
If the device has low memory, you can expect requests for large amounts of memory to occasionally return NULL. There are applications (such as photo/image processing) which request allocations at tens of megabytes -- many times greater than your 384 KiB allocation.
There is no upper bound on the size of an array, save the amount of available RAM on the device.
I don't think it's too big. Some image resources would take up that much or more contiguous space without problem. For example, a 400x400px image would take about 160,000*4 = 640,000 bytes of memory. I think the problem is somewhere else.