How to run the code for large variables - matlab

I have one code like below-
W = 3;
i = 4;
s = fullfact(ones(1,i)*(W + 1)) - 1;
p2 = unique(sort(s(sum(s,2) == i,:),2),'rows');
I can run this code only up to i = 11, but I want to run it for up to i = 25. When I run the code for i = 12 it shows the error message "Out of Memory".
I need to keep this code as it is. How can I modify it for larger values of i?
MATLAB experts, I need your valuable suggestions.

Just wanting to do silly things is not enough. You are generating arrays that are simply too large to fit into memory.
See that the size of the matrix s is a function of i here: size(s) will be (W+1)^i = 2^(2*i) by i. (By the way, some will argue it is a bad idea to use i, which normally denotes sqrt(-1), as a name for such a variable.)
So when i = 4, s is only 256x4.
When i = 11, s is 4194304x11. This array takes 369098752 bytes of space, so 370 megabytes.
When i = 25, the array will be of size 2^50 by 25, i.e. 2^50 * 25 = 2.8147e+16 elements.
Multiply that by 8 bytes per double to get the memory needed: something like 225 petabytes of memory! If you have that much memory, then send me a few terabytes of RAM. You won't miss them.
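For a rough feel for how fast this grows, here is a small back-of-the-envelope sketch (just arithmetic, not part of the original code) that prints the memory s would need for each value of i:
% Memory needed by s = fullfact(ones(1,i)*(W+1)) - 1 for W = 3:
% (W+1)^i rows of i doubles, 8 bytes each.
W = 3;
i = 1:25;
bytesNeeded = (W + 1).^i .* i * 8;
fprintf('i = %2d  ->  %.3g GB\n', [i; bytesNeeded/1e9])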

Yes, there are times when MATLAB runs out of memory. You can get the amount of memory available at any point in time by executing the following:
memory
However, I would suggest following one of the strategies for reducing memory usage described here. Also, you might want to clear variables that are no longer required in each iteration with
clear variable_name
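For example (on Windows, where the memory function is fully supported; s here is just the large intermediate from the question), you can inspect the limits programmatically and drop big variables as soon as they are no longer needed:
m = memory;                     % struct with MaxPossibleArrayBytes, MemAvailableAllArrays, ...
disp(m.MaxPossibleArrayBytes)   % largest array MATLAB could allocate right now
clear s                         % free the fullfact output once p2 has been computed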

Matlab Horzcat - Out of memory

Any trick to avoid an out of memory error in matlab?
I am assuming that the reason it shows up is that MATLAB is very inefficient with horzcat and actually needs to temporarily duplicate matrices.
I have a matrix A of size 108977555 x 25. I want to merge this with three vectors d, m and y, each of size 108977555 x 1.
My machine has 32GB of RAM, and the above matrix + vectors occupy 18GB.
Now I want to run the following command:
A = [A(:,1:3), d, m, y, A(:,5:end)];
But that yields the error:
Error using horzcat
Out of memory. Type HELP MEMORY for your options.
Any trick to do this merge?
From the MATLAB documentation on working with large data sets: if you are working with large data sets, you need to be careful when increasing the size of an array to avoid getting errors caused by insufficient memory. If you expand the array beyond the available contiguous memory of its original location, MATLAB must make a copy of the array and set this copy to the new value. During this operation, there are two copies of the original array in memory.
Restart MATLAB. I often find it doesn't fully clean up its memory, or the memory gets fragmented, leading to smaller maximal array sizes.
Change your datatype (if you can). E.g. if you're only dealing with numbers 0-255, use uint8; the memory size will shrink by a factor of 8 compared to an array of doubles.
Start off with A already large enough (i.e. 108977555x27 instead of 108977555x25) and insert in place:
A(:, 4) = d;
clear d
A(:, 5) = m;
clear m
A(:, 6) = y;
Merge the data into one datatype to reduce the total memory requirement, e.g. a date easily fits into one uint32.
Leave the data separate; think about why you want the data in one matrix in the first place and whether that is really necessary.
Use C-code to do the data allocation yourself (only if you're really desperate)
Further reading: https://nl.mathworks.com/help/matlab/matlab_prog/memory-allocation.html
Even if you could make it work using Gunther's suggestions, the result would just occupy memory. Right now it takes more than half of the available memory. So, what are you planning to do with it then? Even a simple B = A + 1 doesn't fit. The only things you can do are reductions like sum, or operations on part of the array.
So you should consider moving to tall arrays and other related big-data concepts, which are exactly meant for working with such large datasets.
https://www.mathworks.com/help/matlab/tall-arrays.html
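As a rough illustration only (the chunked files and the variable name Var1 are hypothetical, and this assumes A can first be exported to disk in pieces), the tall-array workflow looks something like this:
% Hypothetical example: A has been written out chunk-by-chunk as CSV files.
ds = tabularTextDatastore('A_part_*.csv');   % reads the chunks lazily
tA = tall(ds);                               % tall table backed by the datastore
colMean = mean(tA.Var1);                     % deferred; nothing is computed yet
colMean = gather(colMean);                   % evaluates in chunks that fit in memory
Only gather actually touches the data, and it processes it block by block instead of loading everything at once.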
You can first try the efficient memory management strategies as mentioned on the official mathworks site : https://in.mathworks.com/help/matlab/matlab_prog/strategies-for-efficient-use-of-memory.html
Use single (4 bytes) or some other smaller data type instead of double (8 bytes) if your code can work with that.
If possible use block processing (like rows or columns) i.e. store blocks as separate mat files and load and access only those parts of the matrix which are required.
Use the matfile command to load large variables in parts. Perhaps something like this:
save('A.mat','A','-v7.3')
oldMat = matfile('A.mat');
clear A
newMat = matfile('Anew.mat','Writable',true);  % empty mat-file (note the spelling: 'Writable')
for i = 1:27
    if (i < 4),  newMat.A(:,i) = oldMat.A(:,i);   end
    if (i == 4), newMat.A(:,i) = d; end
    if (i == 5), newMat.A(:,i) = m; end
    if (i == 6), newMat.A(:,i) = y; end
    if (i > 6),  newMat.A(:,i) = oldMat.A(:,i-2); end
end

How to do a median projection of a large image stack in Matlab

I have a large stack of 800 16bit gray scale images with 2048x2048px. They are read from a single BigTIFF file and the whole stack barely fits into my RAM (8GB).
Now I need to do a median projection: that means I want to compute the median of each pixel across all 800 frames. The MATLAB median function fails because there is not enough memory left to make a copy of the whole array for the function call. What would be an efficient way to compute the median?
I have tried using a for loop to compute the median one pixel at a time, but this is still terribly slow.
Iterating over blocks, as @Shai suggests, may be the most straightforward solution. If you do have this problem frequently, you may want to consider converting the image to a mat-file, so that you can access the pixels as an n-d array directly from disk.
%# convert to mat file
matObj = matfile('dest.mat','Writable',true);
matObj.data(2048,2048,numSlices) = 0;
for t = 1:numSlices
    matObj.data(:,:,t) = imread(tiffFile,'index',t);
end
%# load a block of the matfile to take the median (run as part of a loop over blocks)
medianOfBlock = median(matObj.data(1:128,1:128,:),3);
I bet that the distributions of the individual pixel values over the stack (i.e. the per-pixel histograms) are sparse.
If that's the case, the amount of memory needed to keep all the pixel histograms is much less than 2K x 2K x 64k: you can use a compact hash map to represent each histogram, and update them loading the images one at a time. When all updates are done, you go through your histograms and compute the median of each.
If you have access to the Image Processing Toolbox, MATLAB has a tool for handling large images called blockproc.
From the docs:
To avoid these problems, you can process large images incrementally: reading, processing, and finally writing the results back to disk, one region at a time. The blockproc function helps you with this process.
I will try my best to provide help (if any), because I don't have an 800-stack TIFF image, nor an 8GB computer, but I want to see if my thinkings can form a solution.
First, 800 * 2048 * 2048 pixels at 16 bit each is roughly 6.7GB, not including headers. With your 8GB of RAM that only barely fits at once, and there might be too many other programs running, chopping up the contiguous memory. Anyway, let's treat the problem as if MATLAB can't load it as a whole into memory.
As Jonas suggests, imread supports loading a TIFF image by index. It also supports a PixelRegion parameter, so you can also consider accessing parts of the image through this parameter if you want to use @Shai's idea.
I came up with a median algorithm that doesn't hold all the data at the same time; it merely scans through a sequence of unordered data, one value at a time, but it does keep 256 counters in memory.
data = randi([0,255], 1, 800);
bins = num2cell(zeros(256,1,'uint16'));
for ii = 1:800
    bins{data(ii)+1} = bins{data(ii)+1} + 1;
end
% clearvars data
s = cumsum(cell2mat(bins));
if find(s==400)
    med = ( find(s==400, 1, 'first') + ...
            find(s>400, 1, 'first') ) /2 - 1;
else
    med = find(s>400, 1, 'first') - 1;
end
It's not very efficient, not least because it uses a for loop. But the benefit is that instead of keeping 800 raw values in memory, only 256 counters are kept; the counters need uint16, so they take roughly as much space as 512 raw 8-bit values. If you are confident that for any pixel the same grayscale level won't occur more than 255 times among the 800 samples, you can choose uint8 instead and halve that memory.
The above code is for one pixel. I'm still thinking about how to expand it to a 2048x2048 version, such as
for ii = 1:800
    img_data = randi([0,255], 2048, 2048);
    % (do stats stuff)
end
By doing so, for each iteration, you only need these kept in memory:
One frame of image;
A set of counters;
A few supplemental variables, with size comparable to one frame of image.
I use a cell array to store the counters. According to this post, a cell array can be pre-allocated while its elements can still be stored in memory non-contiguously. That means the 256 counters (512*2048*2048 bytes) can be stored separately, which is quite reasonable for your 8GB RAM. But obviously my sample code does not take advantage of that, since it starts from bins = num2cell(zeros(...)).
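For what it is worth, here is one way the counter idea could be expanded to the whole frame. This is my own untested sketch, assuming 8-bit gray levels as in the toy example above (the real 16-bit data would need 65536 bins or a prior rescale); tiffFile is the same placeholder as in the earlier matfile example, and it computes the lower median (the 400th order statistic) rather than averaging the 400th and 401st values:
nRows = 2048; nCols = 2048; nFrames = 800;
counts = zeros(nRows, nCols, 256, 'uint16');   % ~2 GiB, the 512*2048*2048 bytes mentioned above
for t = 1:nFrames
    img = imread(tiffFile, 'Index', t);        % only one frame in memory at a time
    % linear index of counts(r, c, img(r,c)+1) for every pixel:
    lin = (1:numel(img))' + double(img(:)) * numel(img);
    counts(lin) = counts(lin) + 1;
end
% Per-pixel lower median: smallest gray level whose cumulative count reaches nFrames/2.
running = zeros(nRows, nCols, 'uint16');
med = zeros(nRows, nCols, 'uint8');
done = false(nRows, nCols);
for k = 1:256
    running = running + counts(:,:,k);
    hit = ~done & (running >= nFrames/2);
    med(hit) = k - 1;
    done = done | hit;
end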

Non-vectorized to vectorized, matlab

I have two issues. Please see the following code:
it = 0:0.01:360;
jt = 0:0.01:270;
LaserS = zeros(size(it,2)*size(jt,2), 2);
p = 1;
for m = it
    for n = jt
        LaserS(p,:) = [m, n];
        p = p + 1;
    end
end
It is very slow and also takes a lot of memory (about 7.7765e+009 bytes), so I can't run it. How can I improve it and solve the memory issue?
I'm using Win7 64-bit with 8GB of RAM.
What are you trying to do? 'reshape' should solve your problem.
LaserS = zeros(size(it,2)*size(jt,2), 2);
JT = reshape(repmat(jt,[1,numel(it)]), 1, numel(jt)*numel(it));
IT = reshape(repmat(it,[numel(jt),1]), 1, numel(jt)*numel(it));
LaserS = [IT.', JT.'];   % [m, n] column order, as in the original loop
Pre-allocating the array saves you the memory hit of growing it; otherwise there is no memory optimization here.
You can't reduce memory usage unless you use fewer values. Can it and jt step by 0.1 instead of 0.01?
Here's a way to build your result matrix without a loop.
LaserS = [kron(it.', ones(length(jt), 1)), repmat(jt.', length(it), 1)]; % each it value repeated length(jt) times, jt tiled length(it) times, matching the loop order
This code seems to be doing "nothing" in the sense that, after it runs, you end up with the matrix LaserS with a shape of 972000000 x 2. If you really need these values to be loaded to memory at the same time, that is the size and there is not much to do about it.
What would be my first approach, which cannot be inferred directly from the code you posted, is that maybe you can achieve the overall goal of your program if you generate the matrix data "on the fly" while you perform further processing over LaserS.
Hope this helps!
This should do it:
it = it(:);
jt = jt(:);
nIt = size(it,1);                   % number of it values, saved before jt is expanded
it = repmat(it', size(jt,1), 1);    % each it value repeated numel(jt) times
it = it(:);
jt = repmat(jt, nIt, 1);            % jt tiled so that it cycles fastest, as in the loop
LaserS = [it, jt];
In addition to the nice solutions already presented here: if you want to reduce memory, there is no reason to use double. You can use single and halve the memory required. You can also encode the 0.01 step size as a unit step (that is, it = uint16(0:1:36000)), storing the numbers as uint16 integers; this will use only one quarter of the memory, etc.
Meshgrid will do this cleanly as well, perhaps with a reshape if you insist on having it in an (n*m)-by-2 matrix. But why do you want that? It seems like you're actually after something else, and it's possible that bsxfun(,it,jt') would do what you want.
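For completeness, a small sketch of the meshgrid route just mentioned (a coarser 0.1 step is used here purely to keep the example small; the rows come out in the same [m, n] order as the original double loop):
it = 0:0.1:360;
jt = 0:0.1:270;
[M, N] = meshgrid(it, jt);     % M(r,c) = it(c), N(r,c) = jt(r)
LaserS = [M(:), N(:)];         % column-major: it changes slowest, jt fastest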

How to speed up array concatenation?

Instead of concatenating results like this, is there another way to do the following? I mean, the loop will persist, but can vector=[vector,sum(othervector)]; be written in any other way?
vector = [];
while a - b ~= 0
    othervector = sum(something')  % returns a vector like [1; 3]
    vector = [vector, sum(othervector)];
    ...
end
vector = vector./100
Well, this really depends on what you are trying to do. Starting from this code, you might need to think about the actions you are performing and whether you can change that behavior. Since the snippet of code you present shows few dependencies (i.e. how a, b, something and vector are related), I think we can only present vague solutions.
I suspect you want to get rid of this pattern to avoid the cost of constantly reallocating and copying the array as you concatenate new results onto it.
First of all, make sure that the slowest portion of your application is actually caused by this. Take a look at the MATLAB profiler. If that portion of your code is not a major time hog, don't bother spending a lot of time on improving it (and just tell mlint to ignore that line of code).
If you can analyse your code enough to ensure that you have a constant number of iterations, you can preallocate your variables and prevent any performance penalty (i.e. write a for loop in the worst case, or better yet truly vectorized code). Or if you can 'factor out' some variables, that might help as well (move any loop invariants outside of the loop). So that might look something like this:
iIteration = 1;
vector = zeros(1,100);
while a - b ~= 0
    othervector = sum(something);
    vector(iIteration) = sum(othervector);
    iIteration = iIteration + 1;
end
If the nature of your code doesn't allow this (e.g. you are iterating until convergence; in that case, beware of checking equality of doubles: always include a tolerance), there are some tricks you can use to improve performance, but most of them are just rules of thumb or ways of making the best of a bad situation. In this last case, you might add some maintenance code to get slightly better performance (but what you gain in time, you lose in memory usage).
Let's say you expect the code to run about 100*n iterations most of the time; then you might try something like this:
iIteration = 0;
expectedIterations = 100;
vector = [];
while a - b ~= 0
    if mod(iIteration, expectedIterations) == 0
        vector = [vector zeros(1, expectedIterations)];
    end
    iIteration = iIteration + 1;
    vector(iIteration) = sum(sum(something));
    ...
end
vector = vector(1:iIteration); % throw away uninitialized entries
vector = vector/100;
It might not look pretty, but instead of resizing the array every iteration, the array only gets resized every 100th iteration. I haven't run this piece of code, but I've used very similar code in a former project.
If you want to optimize for speed, you should preallocate the vector and keep a counter for the index, as @Egon answered already.
If you just want to have a different way of writing vector=[vector,sum(othervector)];, you could use vector(end + 1) = sum(othervector); instead.

MATLAB program takes more and more of my memory

I'm writing a program in MATLAB that takes a function, steps the value D from 10 to 100 (the for loop), integrates the function with Simpson's rule to a given tolerance (the while loop) and then displays the result. This works fine for the first 7-8 values of D, but then each iteration takes longer and longer and eventually I run out of memory, and I don't understand the reason for this. This is the code so far:
global D;
s = 200;
tolerance = 9*10^(-5);
for D = 10:1:100
    r = Simpson(@f, 0, D, s);
    error = 1;
    while (error > tolerance)
        s = 2*s;
        error = (1/15)*(Simpson(@f, 0, D, s) - r);
        r = Simpson(@f, 0, D, s);
    end
    clear error;
    disp(r)
end
mtrw's comment probably already answers the question in part: s should be reinitialized inside the for loop. The posted code results in s increasing irreversibly every time the error was too large, so for larger values of D the largest s so far will be used.
Additionally, since the code re-evaluates the entire integration instead of reusing the previous result on [0, D-1], you waste lots of resources (unless you specifically want to study the error behaviour of your Simpson function): s will have to increase a lot for large D to maintain the same low error, since you integrate over a larger range and have to sum up more points.
Finally, your implementation of Simpson could of course do funny stuff as well, which no one can tell without seeing it...
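To make the first point concrete, here is a minimal sketch of that fix (Simpson and f are the poster's own functions and are assumed to exist; the error estimate keeps the same 1/15 factor as the original):
global D;                                 % f may rely on the global D, as in the original
tolerance = 9e-5;
for D = 10:100
    s = 200;                              % reinitialize s for every new D
    r = Simpson(@f, 0, D, s);
    err = Inf;
    while abs(err) > tolerance
        s = 2*s;
        rNew = Simpson(@f, 0, D, s);
        err = (rNew - r)/15;              % same Richardson-style estimate as before
        r = rNew;
    end
    disp(r)
end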