Matlab loading slowdown in loop

After a long time searching for an answer and not finding one, my last resort is asking a new question. I create multiple (N=1000) -v6 MAT-files that are each about 100 MB in size and contain a single matrix. In a separate part of my code, I need to load in each file. The problem I'm running into is that loading the files suddenly becomes very time-consuming around file number 600, and I'm not sure why it's happening. Thanks in advance for any suggestions.
I'm using Matlab R2014b on a Mac with 16 GB of RAM.
Sample code
c = nan(1,1000);                              % preallocate the timing vector
for h = 1:1000
    tic
    filename = [basefilename, '_', num2str(h), '.mat'];
    transition = load(filename, 'P');         % load the matrix P from file h
    c(h) = toc;                               % record the loading time
end
Here is an image of the recorded loading times using the code above:

Related

How to write "Big Data" to a text file using Matlab

I am getting readings from an accelerometer connected to an Arduino, which is in turn connected to MATLAB through serial communication. I would like to write the readings into a text file. A 10-second reading writes around 1000 entries, making the text file around 1 kbyte in size.
I will be using the following code:
%%%%%// Communication %%%%%
arduino = serial('COM6', 'BaudRate', 9600);
fopen(arduino);
fileID = fopen('Readings.txt', 'w');
%%%%%// Reading from Serial %%%%%
vib = [];                     % readings accumulated so far (grows each iteration)
for i = 1:Samples             % Samples = total number of readings to take
    scan = fscanf(arduino, '%f');
    if isfloat(scan)
        vib = [vib; scan];
        fprintf(fileID, '%0.3f\r\n', scan);
    end
end
fclose(fileID);
fclose(arduino);
Any suggestions on improving this code? Will this hit a time or size limit? The code is to be run for 3 days.
Do not use text files; use binary files. 42718123229.123123 takes 18 bytes in ASCII but only 8 bytes as a binary double. Don't waste space unnecessarily. If your data is going to be used later in MATLAB, then I suggest you simply save it in .mat files.
Do not use a single file! Choose a reasonable file size (e.g. 100 MB) and make sure that when you reach that much data you switch to another file. You could do this by e.g. saving one file per hour. This way you minimize the data lost if the software crashes 2 minutes before finishing.
Now, knowing the real dimensions of your problem, writing a text file is totally fine; nothing special is required to process such small data. But there is a problem with your code. You are writing to a variable vib that grows over time. That can cause bad performance because you are not preallocating, and it may consume a lot of memory. I strongly recommend not keeping this variable at all; if you need the data, read it back from the file afterwards.
Another thing you should consider is verification of your data. What do you do when you receive fewer samples than you expect? Include timestamps! Be aware that these timestamps are not precise because you add them after the fact, but they let you identify whether just some random samples are missing (these may be interpolated afterwards) or a consecutive run of maybe 100 samples is missing.
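A minimal sketch combining the advice above (binary output, one file per hour, timestamps, no growing array in memory); the port name and Samples count are carried over from the question, and the file-name scheme is a hypothetical choice:
arduino = serial('COM6', 'BaudRate', 9600);
fopen(arduino);
fileID = -1;
currentHour = -1;
for i = 1:Samples                             % Samples assumed defined elsewhere
    scan = fscanf(arduino, '%f');
    if isfloat(scan)
        t = clock;                            % [yr mon day hr min sec]
        if t(4) ~= currentHour                % rotate to a fresh file each hour
            if fileID > 0, fclose(fileID); end
            fileID = fopen(sprintf('readings_%s.bin', datestr(now, 'yyyymmdd_HH')), 'w');
            currentHour = t(4);
        end
        fwrite(fileID, [now; scan(:)], 'double');  % timestamp + sample, 8 bytes each
    end
end
if fileID > 0, fclose(fileID); end
fclose(arduino);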

Reading images from VideoReader gets progressively slower

I've been trying to read an MP4 file using VideoReader. Matlab is able to read the images, but the further along the video a frame is, the more time it takes to read.
tic;I=read(v,1);toc
Elapsed time is 0.264011 seconds.
tic;I=read(v,2000);toc
Elapsed time is 32.859614 seconds.
Also, I'm not sure if this is related, but Matlab cannot determine the number of frames in the file:
v=VideoReader('S1140007 (~200 cubes, large).MP4');
Warning: Unable to determine the number of frames in this file.
I've tried using two versions R2012b and R2015a, and the problem persists.
On a different machine, however, the number of frames can be determined and the reading times don't get longer, so obviously there's something configured wrong on my machine.
Is there a known solution for this problem (could it be related to codecs somehow?), or maybe an alternative method of reading one image at a time (readFrame is not relevant for my needs)?
Any help would be appreciated,
Aviram
OK, so this is not exactly an answer, but a workaround...
It seems that to set the NumberOfFrames property in the VideoReader object created for a video with an undetermined number of frames, one needs to read the last frame using the following code (as mentioned in the VideoReader documentation):
v = VideoReader('path.mp4');
l = read(v, inf);   % reading the last frame forces the frame count to be set
This sets the number of frames in the video and allows for indexing and quick reading of single frames. However, this only works in Matlab R2012b. In R2015a, the NumberOfFrames property is set by the read(v,inf) trick, but reading is still very time-consuming, for some reason.
I'm not sure why this happens, and as I've said, some of the other machines I've checked were able to read my files properly (but some weren't), so this is far from resolved. It is not clear why the number of frames cannot be determined, why there's any variability between computers, or why in some versions the read(v,inf) trick works and in others only partially.

Executing VideoReader('movie.mp4') takes so long in Matlab

I am trying to read many video files from a database and process them. I am using Matlab, and my problem is that when I want to read a 10-minute full-HD video I have to wait a very long time and my computer stops performing well. I use this command:
VideoReader('movie.mp4')
I have seen that it takes 47 seconds to read a 30-second video in the same format. I do not need to load all frames into memory; I just need 11 frames for each step of my process, and I'm really stuck here. Any help will be appreciated.
Also, here is the output when I run this command:
disp(videoObj);
output:
Summary of Multimedia Reader Object for 'movie.mp4'.
Video Parameters: 30.00 frames per second, RGB24 1280x720.
1482 total video frames available.
By the way, I am running my code on Matlab R2014a and my OS is Ubuntu 14.04.
Siavash,
The long loading time is because the entire file is scanned to determine the number of frames. This process is necessary to support frame-index-based access. In R2014b and higher, frame counting during construction has been disabled. Additionally, you can seek to specific locations in the file using the CurrentTime property and use the hasFrame/readFrame methods for reading to avoid this performance penalty.
Dinesh
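A sketch of the seek-based approach described above, assuming R2014b or later; the file name and the figure of 11 frames per step are taken from the question:
v = VideoReader('movie.mp4');    % returns quickly; no up-front frame counting
v.CurrentTime = 10;              % seek straight to the wanted time offset, e.g. 10 s
frames = cell(1, 11);            % the 11 frames needed for one processing step
k = 0;
while hasFrame(v) && k < 11
    k = k + 1;
    frames{k} = readFrame(v);    % sequential reads from the seek point
end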
I'm using version R2015b and I find the same slow processing of MP4 files with the VideoReader and readFrame functions. However, I find that those functions perform much faster on an AVI file than an MP4, so I first convert the MP4 to AVI using an independent program from https://www.ffmpeg.org. I don't know why there's such a difference in speed between the two... perhaps someone from MATLAB can provide some insight into that question.
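For illustration, a hypothetical conversion call from within Matlab, assuming ffmpeg is installed and on the system path (the MJPEG codec choice here is my assumption, not part of the answer):
status = system('ffmpeg -i movie.mp4 -c:v mjpeg -q:v 2 movie.avi');
if status == 0
    v = VideoReader('movie.avi');   % reportedly faster to read than the MP4
end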

Why would saving to a folder called temp cause data loading to slow down in a for loop in Matlab?

IMPORTANT UPDATE
I just made the discovery that after restarting Matlab and the computer, this simplified code no longer reproduces the problem for me either... I am so sorry for taking up your time with a script that didn't work. However, the old problem still persists in my original script if I save anything in any folder (that I have tried) in the inner for loop. For my purposes, I have worked around it by simply not making this save unless I absolutely need it. The original script has the following structure in terms of for loops and use of save or load:
load()                 % .mat file, size 365x92x240
for day = 1:365
    load()             % .mat file, size 8x92x240
    for type = 1:17
        load()         % .mat file, size 17x92x240
        load()         % .mat file, size 92x240
        for step = 1:8
            % only calculations
        end
        save()         % .mat file, size 8x92x240
    end
    save()             % .mat file, size 8x92x240
end
% the loads and saves below are in for loops too, but do not seem to affect
% the described behavior in the above script
load()                 % .mat file, size 8x92x240
save()                 % .mat file, size 2920x92x240
load()
save()                 % .mat file, size 365x92x240
load()
save()                 % .mat file, size 12x92x240
If run in full, the script saves approx. 10 GB and loads approx. 2 GB of data.
The entire script is rather lengthy and makes a lot of saves and loads, so it would be rather impractical to share it all here before I have managed to reproduce the problem in a reduced version, unfortunately. As I frustratingly discovered that the very same code could behave differently from time to time, it immediately got more tedious than anticipated to find a simplification that consistently reproduces the behavior. I will get back as soon as I am sure about a manageable piece of code that produces the problem.
PREVIOUS PROBLEM DESCRIPTION
(NB: the code below does not reliably reproduce the described problem.)
I just learnt the hard way that, in Matlab, you can't name a save folder temp in a for loop without slowing down data loading in the next round of the loop. My question is: why?
If you are interested in reproducing the problem yourself, please see the code below. To run it, you will also need a MAT-file called anyData.mat to load, and two folders for saving, one called temp and the other called temporary.
clear all; clc; close all; profile off;
profile on
endDay = 2;                        % last day index (the loop below runs day = 0:endDay)
tT = zeros(1, endDay+1);
tTD = zeros(1, endDay+1);
for day = 0:endDay
    tic
    T = importdata('anyData.mat');
    tT(day+1) = toc;               % loading time in seconds
    tic
    TD = importdata('anyData.mat');
    tTD(day+1) = toc;
    for type = 0:1
        saveFile = ones(92,240);
        save('AnyFolder\temporary\saveFile.mat', 'saveFile')  % leads to fast data loading
        %save('AnyFolder\temp\saveFile.mat', 'saveFile')      % leads to slow data loading
    end % end of type
end % end of day
profile off
profile report
plot(tT)
You will see on the y-axis of the plot that data loading takes significantly longer when, in the inner for loop, you save to temp rather than temporary. Does anyone out there know why this occurs?
There are two things here
Storage during a for loop is an expensive operation, as it usually opens a file stream and closes it before moving on. You might not be able to avoid this.
The second thing is the speed of the storage medium and of its cache. Most likely, other programs use the temp folder for their own temporary files and have a garbage collector or cleanup software looking after it. If you start opening and closing file streams to this folder, you have to send a request to get exclusive write access to it, which again adds to the time.
If you are doing image-processing operations on multiple images, you can run into a bottleneck when writing to the hard drive, due to its speed, its cache, and the memory currently available to MATLAB.
I can't reproduce the problem; I suspect it's system- and data-size-specific. But here are some general comments which could help you out of the predicament:
As pointed out by commenters and the above answers, file I/O within a double for loop can be extremely parasitic, especially in cases where you only need to access part of the data in the file, where other system operations delay the process, or where the data files are large enough to require virtual memory (Windows) / swap space (Linux) to even load them. In the latter case, you could be in a situation where you're moving a file from one part of the hard disk to another just by opening it!
I assume that you're loading/saving because you don't have the c. 10 GB of RAM needed to hold everything in memory for computation. The actual problem is not described, so I can't be certain, but I think you might find the matfile class useful... TMW documentation (and see the sketch after this list). This class maps directly to/from a MAT-file. It:
reduces file stream opening and closing IOPS
allows arbitrarily large variable sizes (governed by disk size, not memory)
allows you to read/write partially (i.e. write only some elements of an array without loading the whole file)
in the case that your mat file is too large to be held in memory, avoids loading it into swap space which would be extremely cumbersome.
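A minimal sketch of partial read/write with matfile; the file and variable names are hypothetical, with array sizes borrowed from the question:
m = matfile('results.mat', 'Writable', true);
m.A = zeros(365, 92, 240);       % create the variable on disk once
for day = 1:365
    slice = rand(1, 92, 240);    % stand-in for the real computation
    m.A(day, :, :) = slice;      % write one slice; the rest never leaves disk
end
oneDay = m.A(200, :, :);         % read back a single slice, not the whole array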
Hope this helps.
Tom

Matlab save figure containing large number of elements as bitmap

I am trying to save a 3D figure that I have generated using scatter3 with the following command:
set(gcf,'PaperPositionMode','auto')
print -zbuffer -dtiff -r300 figure_name.tif
on Matlab running on a Mac.
Upon executing the command, the CPU load increases but nothing happens. I have waited for about 24 hours to no avail. I have tried the same on a very well-specced Windows workstation with
print -opengl -dtiff -r300 figure_name.tif
but that didn't make any difference.
Normally for figures that contain fewer data points this command works very well and produces decent output in a matter of seconds.
I can save the figure in .fig format, but what I really need is a decent-resolution image file. The figure contains about 1 million data points, and it shows up on screen without much delay when I plot it. I have tried reducing the number of data points to 200,000, but this also does not work. For a plot with fewer than 40,000 data points it does work, no matter whether I am on my Windows (64-bit, 48 GB RAM) or Mac (64-bit, 4 GB RAM) system. However, I need at least 100,000 data points to illustrate what I want to show.
And no luck with this:
print(gcf,'-dpng','figure_name.png');
I also tried the Save As option in the figure GUI but that's not doing any better either.
Essentially I have to kill the Matlab process to stop it; Ctrl+C does not help.
Does anyone have an idea how I could get my high resolution .tif file (can be any bitmap format really)?
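One possible workaround to try (my suggestion, untested, and not an answer from this thread): capture the already-rendered figure with getframe instead of re-rendering it through print. The resolution is then limited to the on-screen size of the figure window, so it may not reach 300 dpi.
drawnow;                                  % make sure rendering has finished
frame = getframe(gcf);                    % grab the rendered pixels from the figure
imwrite(frame.cdata, 'figure_name.tif');  % write them out as a TIFF bitmap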