Matlab memory problems

I have a memory issue with a simple Matlab script. When I run the following loop with the function listed below, Matlab gets to about image 50, then runs out of memory and stops. I need to process about 400 images and save each result to my hard drive for inspection. As you can see in the function, I try to clear everything, yet the memory keeps growing. I have tried this code with Matlab R2013a and R2014a and get the same problem.
Here is the function (the input is a .jp2 image):
function plot1(image,boundary)
image1 = imread(image);
h = figure;
imshow(image1)
rectangle('Position',boundary)
saveas(h,[image '_newimage.jpg'],'jpg')
clear image1; clear h; cla reset; clear classes;
clear all; close all;
end
The above function is simple enough and I do not understand why the memory keeps increasing since I have cleared everything.
The program that calls this function looks like this (the image list and the xmin, ymin, width, and height values were read from a list earlier in the program, in a different loop):
for k=1:nf
boundary=[xmin(k) ymin(k) width(k) height(k)]
plot1(image{k},boundary)
end
Does anyone have any ideas about what is causing the memory increase?
Thank you for your time.
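One pattern that may avoid the growth (a minimal sketch on my part, not from the original post) is to keep the figure invisible and close its handle explicitly with close(h) instead of clearing it; the input argument is also renamed to imageFile here, since image shadows Matlab's built-in image function:

function plot1_noleak(imageFile, boundary)
% Hypothetical variant of plot1: no visible window, and the figure handle is
% closed explicitly so its graphics objects are actually released.
img = imread(imageFile);
h = figure('Visible', 'off');
imshow(img)
rectangle('Position', boundary)
saveas(h, [imageFile '_newimage.jpg'], 'jpg')
close(h)   % releases the figure; clearing the handle variable alone would not
end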

Related

Matlab Process Memory Leak Over 16 Days

I'm running a real-time data assimilation program written in Matlab, and there seems to be a slow memory leak. Over the course of about 16 days, the average memory usage has increased by about 40% (see the figure below) from about 1.1GB to 1.5GB. The program loops every 15 minutes, and there is a peak in memory usage for about 30 seconds during the data assimilation step (visible in the figure).
At the end of each 15 minute cycle, I'm saving the names, sizes, and types of all variables in the currently active workspace to a .mat file using the whos function. There are just over 100 variables, and after running the code for about 16 days, there is no clear trend in the amount of memory used by any of the variables.
Some variables are cleared at the end of each cycle, but some of them are not. I'm also calling close all to make sure there are no figures sitting in memory, and I made sure that when I'm writing ASCII files, I always fclose(fileID) the file.
I'm stumped...I'm wondering if anyone here has any suggestions about things I should look for or tools that could help track down the issue. Thanks in advance!
Edit, system info:
RHEL 6.8
Matlab R2014b
I figured out the problem. It turns out that the figure handles were hidden, and close('all') only works on figures that are visible. I assume they're hidden because the figures are created outside the scope of where I was trying to close the figures. The solution was to replace close('all') with close all hidden, which closes all figures including those with hidden handles.
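A minimal sketch of that fix (the findall call is only there to illustrate how hidden figures can be listed):

hiddenFigs = findall(0, 'Type', 'figure');   % findall sees figures even when their handles are hidden
close all hidden                             % closes visible figures and those with hidden handles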
I'll go ahead and restate what @John and @horchler mentioned in their comments, in case their suggestions can help people with similar issues:
Reusing existing figures can increase performance and reduce the potential for memory leaks (see the sketch after this list).
Matlab has an undocumented memory profiler that could help debug performance related issues.
For processes that are running indefinitely, it's good practice to separate data collection/processing and product generation (figures etc). The first reads in and processes the data and saves it to a DB or file. The second allows you to "view/access/query" the data.
If you are calling compiled mex functions in your code, the memory leak could be coming from the Fortran or C/C++ code. Not cleaning up a single variable could cause a leak, and would explain linear memory growth.
The Matlab function whos is great for looking at the size in memory of each variable in the workspace. This can be useful for tracking down which variable is the culprit of a memory leak.
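As a rough illustration of the first suggestion (reusing figures), here is a minimal sketch; nCycles and getLatestData are hypothetical stand-ins for the real cycle count and data-assimilation step:

h  = figure;                          % create the figure and line object once
ax = axes('Parent', h);
p  = plot(ax, nan, nan);
for k = 1:nCycles                     % nCycles: hypothetical number of 15-minute cycles
    [x, y] = getLatestData(k);        % getLatestData: placeholder for the real data step
    set(p, 'XData', x, 'YData', y)    % update the existing line instead of replotting
    drawnow
end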
Thanks @John and @horchler!

Debugging in Matlab avoids memory leak

I have a memory-intensive Matlab script.
What puzzles me is that if I run this code, it leaks memory at the very first iteration (out of the 46 expected). The leak eventually becomes so big that it forces me to kill Matlab.
Trying to find the leak, I set a breakpoint at the first line of the loop, but as I hit "Continue" the execution ran through the first iteration, stopped again at the breakpoint, and produced no leak. Removing the breakpoint and continuing from that point reintroduces the leak.
Using the breakpoint to execute the code one loop at the time avoids the leak and the code terminates with no issues (fig.2).
Now, I would like to:
1) understand whether this leak is due to something I introduced or whether it could be a Matlab specific issue,
2) get an idea of how to find the leak (I cannot use the debugger as it removes the problem).
I would love to provide the code but it is quite a big chunk (>100 lines), so my question is more about the general approach than the actual debugging of the specific issue.
Thanks for the suggestions.
My approach was to isolate the portion of code causing the problem by adding printouts above each line of code, so that before the crash I could see where execution had stopped.
The culprit was a zeros(100k) line where I tried to pre-allocate a big matrix.
I tried executing the same line on a newer version of Matlab (2015b vs 2014b) and found that while the older version lets you instantiate huge matrices (over ~50k by 50k) and freezes once they consume all the memory, the newer version returns the following error:
Error using zeros
Requested 50000x50000 (18.6GB) array exceeds maximum array size preference. Creation of arrays greater than this limit may take a long time and cause MATLAB to become unresponsive.
See array size limit or preference panel for more information.
In my case the limits for an N-by-N matrix are:
N > ~60000 on Matlab 2014b with 16 GB RAM
N >= 46341 on Matlab 2015b with 12 GB RAM
The difference is that my 2014 version at least lets me try to create them and then collapses when they are too big, whereas the 2015 version prevents me from trying at all.
The puzzling bit is that, on the 2014b version, if I debug the code, Matlab lets the zeros(100k) line run and everything works just fine.
The problem appears again if I try to visualise the contents of the matrix in Matlab's Variables tab.
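As a back-of-the-envelope check (assuming the default double precision, 8 bytes per element), you can estimate what an N-by-N call to zeros would need before running it:

N = 50000;
bytesNeeded = N^2 * 8;                                    % doubles take 8 bytes each
fprintf('zeros(%d) needs about %.1f GB\n', N, bytesNeeded / 2^30)
% prints roughly 18.6 GB, matching the error message above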

MATLAB out of memory on linux despite regular "clear all"

I am batch processing a bunch of files (~200) on MATLAB, in essence
for i = 1:n, process(i); end
where process(i) opens a file, reads it and writes out the output to another file. (I am not posting details about process here because it is hundreds of lines long and I readily admit I don't fully understand the code, having obtained it from someone else).
This runs out of memory after a dozen files or so. Of course, on Linux the memory function is not available, so we have to figure it out "by hand". Well, I thought there was some memory leak, so let's issue a clear all after every run, i.e.
for i = 1:n, process(i); clear all; end
No luck, this still runs out of memory. At the point where this happens, whos says there are just two small arrays in memory (<100 elements). Note that quitting MATLAB and restarting solves the problem, so the computer certainly has enough memory to process a single item.
Any ideas to help me detect where the error comes from would be welcome.
This is probably not the solution you are hoping for, but as a workaround you could have a shell script that loops over several calls to Matlab, as sketched below.
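The Matlab side of that workaround might look something like the sketch below, where process_range is a hypothetical wrapper around the existing process function; the shell script would then start a fresh Matlab for each slice, e.g. with matlab -nodisplay -r "process_range(1,50); exit", so all memory is reclaimed between slices.

function process_range(first, last)
% Hypothetical wrapper: process one slice of the files per Matlab session so the
% operating system reclaims all memory when the session exits.
for i = first:last
    process(i);   % the existing per-file function from the question
end
end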

Find time and memory usage after running a program in MATLAB

Is there any way in Matlab to find out, after a program has finished running, how much time and memory it used?
Also, if the workspace is saved and then loaded again, is it possible to see the time and memory for that?
Thanks.
For the time consumption, would the profiler work? It slows the execution a bit, but is nice for debugging. Otherwise try to enclose the section you want to time with tic-toc.
And for memory consumption there was, and I think still is, no really convenient way to do this, although something may have changed since MathWorks answered this way a few years ago. You can try whos, but that only works inside the current scope. Also, memory can be used to see Matlab's total memory consumption (on Windows).
The time taken to load a file can be measured by enclosing the load with the usual tic/toc commands. The size of a saved file on disk can be seen using dir on the file, but the size in Matlab memory could be different. I guess that the safest way is to check the size (e.g. with whos) before saving, if the file will be loaded in the same session; otherwise it may be convenient to log the size somehow.
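For example, a minimal sketch combining the two (results.mat is just a placeholder name):

tic;
data = load('results.mat');                         % placeholder file name
loadTime = toc;
info = dir('results.mat');                          % on-disk size of the file
fprintf('Loaded %.2f MB in %.2f s\n', info.bytes / 2^20, loadTime)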
I don't know if I got your question correctly, but if you need to trace the time your function takes, there are two ways:
The functions
tic;
t = toc;
work like a stopwatch: tic starts the counting and toc tells you how much time has passed since the last tic.
If you need a more in-depth analysis of the timings, Matlab also offers the profile function.
I suggest you go through the Matlab documentation on how to use it.
Hope this helped.
S.
For execution time between code lines use:
tic;
% ... code you want to time ...
t = toc;
disp(['Execution time: ' num2str(t)])
To inspect and display the memory usage of variables you can use whos:
whos
S = whos; % struct array containing the info for every variable in the current workspace
S.bytes
To calculate the total storage, you can use a loop:
Memory = 0;
S = whos;
for k = 1:length(S)
Memory = Memory + S(k).bytes;
end
disp(['Total memory used by variables in storage (Bytes): ' num2str(Memory)])
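The same total can also be obtained without the loop, since the bytes fields can be concatenated into a vector:

S = whos;
totalBytes = sum([S.bytes]);   % same result as the loop above
disp(['Total memory used by variables in storage (Bytes): ' num2str(totalBytes)])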
You might prefer to look at the whos page in the MathWorks documentation.

Why would saving to a folder called temp cause data loading to slow down in a for loop in Matlab?

IMPORTANT UPDATE
I just made the discovery that after restarting Matlab and the computer, this simplified code no longer reproduces the problem for me either... I am so sorry for taking up your time with a script that didn't work. However, the old problem still persists in my original script if I save anything in any folder (that I have tried) in the inner for loop. For my purposes, I have worked around it by simply not making this save unless I absolutely need it. The original script has the following structure in terms of for loops and use of save or load:
load() % .mat file, size 365x92x240
for day = 1:365
    load() % .mat files, size 8x92x240
    for type = 1:17
        load() % .mat files, size 17x92x240
        load() % .mat files, size 92x240
        for step = 1:8
            % only calculations
        end
        save() % .mat files, size 8x92x240
    end
    save() % .mat files, size 8x92x240
end
% the loads and saves below are in for loops too, but do not seem to affect the described behavior in the above script
load() % .mat files, size 8x92x240
save() % .mat files, size 2920x92x240
load()
save() % .mat files, size 365x92x240
load()
save() % .mat files, size 12x92x240
If run in full, the script saves approx. 10 GB and loads approx. 2 GB of data.
The entire script is rather lengthy and makes a lot of saves and loads. It would be rather impractical to share it all here before I have managed to reproduce the problem in a reduced version, unfortunately. As I frustratingly discovered that the very same code can behave differently from time to time, it immediately became more tedious than anticipated to find a simplification that consistently reproduces the behavior. I will get back as soon as I am sure about a manageable piece of code that produces the problem.
PREVIOUS PROBLEM DESCRIPTION
(NB: the code below is not guaranteed to reproduce the described problem.)
I just learnt the hard way that, in Matlab, you can't save into a folder named temp inside a for loop without slowing down data loading in the next round of the loop. My question is why?
If you are interested in reproducing the problem yourself, please see the code below. To run it, you will also need a .mat file called anyData.mat to load and two folders for saving, one called temp and the other called temporary.
clear all; clc; close all; profile off;
profile on
endDay = 2;                        % the loop below runs for day = 0:endDay
tT  = zeros(1,endDay+1);
tTD = zeros(1,endDay+1);
for day = 0:endDay
    tic
    T = importdata('anyData.mat');
    tT(day+1) = toc;               % loading time in seconds
    tic
    TD = importdata('anyData.mat');
    tTD(day+1) = toc;
    for type = 0:1
        saveFile = ones(92,240);
        save('AnyFolder\temporary\saveFile.mat', 'saveFile')  % leads to fast data loading
        %save('AnyFolder\temp\saveFile.mat', 'saveFile')      % leads to slow data loading
    end % end of type
end % end of day
profile off
profile report
plot(tT)
You will see on the y-axis of the plot that data loading takes significantly longer when, in the inner for loop, you save to temp rather than to temporary. Is there anyone out there who knows why this occurs?
There are two things here.
Saving inside a for loop is an expensive operation, as it usually opens a file stream and closes it before moving on. You might not be able to avoid this.
The second thing is the speed of the storage and its cache. Most likely, other programs use the temp folder for their own temporary files and have a garbage collector or other software cleaning them up. If you start opening and closing file streams to this folder, you have to request exclusive write access to it, which again adds to the time.
If you are doing image processing operations on multiple images, you can also run into a bottleneck when writing to the hard drive, due to its speed, its cache, and the memory currently available to MATLAB.
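One workaround I would suggest (an assumption, not something from the original post) is to keep per-run output out of any shared or specially treated folder such as temp, by saving into a dedicated, uniquely named results directory:

outDir = fullfile('AnyFolder', ['results_' datestr(now, 'yyyymmdd_HHMMSS')]);
if ~exist(outDir, 'dir')
    mkdir(outDir);                     % create the per-run output folder once
end
saveFile = ones(92, 240);
save(fullfile(outDir, 'saveFile.mat'), 'saveFile');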
I can't reproduce the problem; I suspect it's system- and data-size-specific. But here are some general comments which could help you out of the predicament:
As pointed out by commenters and the above answers, file I/O within a double for loop can be extremely costly, especially in cases where you only need part of the data in a file, where other system operations delay the process, or where the data files are large enough to require virtual memory (Windows) / swap space (Linux) just to load them. In the latter case, you could effectively be moving a file from one part of the hard disk to another every time you open it!
I assume that you're loading/saving because you don't have c. 10 GB of RAM to hold everything in memory for computation. The actual problem is not described, so I can't be certain, but I think you might find the matfile class useful (see the TMW documentation, and the sketch after this list). It is used to map variables directly to/from a MAT-file. This:
reduces file stream opening and closing IOPS
allows arbitrarily large variable sizes (governed by disk size, not memory)
allows you to read/write partially (i.e. write only some elements of an array without loading the whole file)
in the case that your mat file is too large to be held in memory, avoids loading it into swap space which would be extremely cumbersome.
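A minimal matfile sketch (the file and variable names are illustrative; the sizes echo those in the question):

m = matfile('bigResults.mat', 'Writable', true);   % creates a v7.3 MAT-file, which supports partial I/O
m.results = zeros(8, 92, 240);                     % write the whole variable to disk once
chunk = rand(1, 92, 240);
m.results(3, 1:92, 1:240) = chunk;                 % overwrite one slice without loading the rest
part = m.results(3, 1:10, 1:10);                   % read back only a small block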
Hope this helps.
Tom