Good use of memory - MATLAB

If I create a cell array with 1000 matrices (each of size 800*1280), will clearing each matrix after using it speed up calculations?
Example
A=cell(1000,1);
for i=1:1000
A{i}=rand(800,1280);
end
image=A{1};
image2=A{2}; % I will use image and image2 with other functions
A{1}=[];
A{2}=[];
EDIT
The real use of the cell array will be like:
A=cell(1000,1);
parfor i=1:1000
A{i}=function_that_creates_image(800,1280); % image with size 800*1280 px
end
for i=1:number_of_images-1 % number_of_images=1000 in this case
image1=A{i};
image2=A{i+1};
A{i}=[]; % free the image that is no longer needed
% image1 and image2 will then be used in the next lines
%next lines of code
end
I noticed that computing the elements of A in a parfor loop is faster than computing each element one at a time inside the main for loop.

If you want to use less memory and speed up calculations, it's wiser to avoid using cell arrays. Luckily, that is very easy in your case: since all your matrices are the same size, you can use an ND-array.
A = zeros(800,1280,1000);
for k = 1:size(A,3)
A(:,:,k) = function_that_creates_image(800,1280);
end
image = A(:,:,1);
image2 = A(:,:,2); % I will use image and image2 with other functions
EDIT:
If you want to further process each image, I would save them to a file within the parfor, so you will have 1000 .mat files at the end of the first loop:
parfor k = 1:number_of_images
A = function_that_creates_image(800,1280);
save(['images_dir\image' num2str(k) '.mat'],'A');
end
Then you can load them as needed for processing using load:
for k = 1:number_of_images-1
image1 = load(['images_dir\image' num2str(k) '.mat']);
image2 = load(['images_dir\image' num2str(k+1) '.mat']);
% note that load returns a struct, so the image data is in image1.A and image2.A
% do what you want with those images...
end
This way you only keep 2 images in memory each time, and on the next iteration they are replaced by the next images.

If you can fit everything in memory (you need at least 16 GB to hold the data and do some work on parts of it; to work on the full beast at the same time you should have 32 GB), clearing these won't change anything at all. If you can't, I would assume/hope MATLAB and Windows are smart enough to optimize which chunk is held in memory and which is put on disk, so again deleting won't help. But you might not want to rely on that.
What you can do is have A{i} = 'path-to-file'; and load each image into memory only for the time it is needed. Why do you even need to first load all the images and then work on them one by one? It would be much better for memory to simply have image1 = rand(...); and image2 = rand(...); in the loop itself, and to reuse these image1 and image2. No need to even have this A.
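For example, a minimal sketch of that idea (the file names and the variable name inside the .mat files are assumptions, matching the save example above):
paths = cell(1000,1);
for i = 1:1000
    paths{i} = fullfile('images_dir', sprintf('image%d.mat', i)); % store the path, not the pixel data
end
for i = 1:999
    s1 = load(paths{i});   % load returns a struct
    s2 = load(paths{i+1});
    image1 = s1.A;         % assumes each .mat file holds the image in a variable named A
    image2 = s2.A;
    % ... process image1 and image2 ...
end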
In general, tall arrays are the memory-friendly solution you should be using if you want to handle tons of data at the same time: https://www.mathworks.com/help/matlab/tall-arrays.html
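A very rough sketch of what that can look like, assuming the images are stored as image files in an images_dir folder (tall arrays need a relatively recent MATLAB release):
ds = imageDatastore('images_dir');  % datastore over the image files
t = tall(ds);                       % tall cell array of images, evaluated lazily
% operations on t are deferred until you call gather()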

Related

How can I read a bunch of TIFF files faster in Matlab?

I have to read in hundreds of TIFF files, perform some mathematical operation, and output a few things. This is being done for thousands of instances. And the biggest bottleneck is imread. Using PixelRegion, I read in only parts of the file, but it is still very slow.
Currently, the reading part is here.
Can you suggest how I can speed it up?
for m = 1:length(pfile)
if ~exist(pfile{m}, 'file')
continue;
end
pConus = imread(pfile{m}, 'PixelRegion',{[min(r1),max(r1)],[min(c1),max(c1)]});
pEvent(:,m) = pConus(tselect);
end
General Speedup
The pixel region does not appear to change at each iteration. I'm not entirely sure whether MATLAB will optimize the min and max calls (though I'm pretty sure it won't). If they don't change at each iteration, move them outside the for loop and calculate them once.
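A sketch of that change, using the variables from the question:
% Hoist the invariant PixelRegion bounds out of the loop
rowRange = [min(r1), max(r1)];
colRange = [min(c1), max(c1)];
for m = 1:length(pfile)
    if ~exist(pfile{m}, 'file')
        continue;
    end
    pConus = imread(pfile{m}, 'PixelRegion', {rowRange, colRange});
    pEvent(:,m) = pConus(tselect);
end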
Parfor
The following solution assumes you have access to the Parallel Computing Toolbox. I tested it with 10,840 TIFFs; each image was 1000x1000 originally, but I only read in a 300x300 section of them. I am not sure how big pConus(tselect) is, so I just stored the whole 300x300 image.
Results based on my 2.3 GHz i7 with 16 GB of RAM:
for: 130s
parfor: 26s + time to start pool
% Setup
clear;clc;
n = 12000;
% Would be faster to preallocate this, but that is negligible compared to
% the time it takes imread to complete.
fileNames = {};
for i = 1:n
name = sprintf('in_%i.tiff', i);
% I do the exist check here, assuming that the file won't be touched
% between now and when it is read a few lines further down.
if exist(name, 'file')
fileNames{end+1} = name;
end
end
rows = [200, 499];
cols = [200, 499];
pics = cell(1, length(fileNames));
tic;
parfor i = 1:length(fileNames)
% I don't know why using the temp variable is faster, but it is
temp = imread(fileNames{i}, 'PixelRegion', {rows, cols});
pics{i} = temp;
end
toc;

read 2d grey images and combine them to 3d matrix

I have some problems with MATLAB 2015a on Windows 10 x64 with 16 GB RAM.
There is a bunch of images (1280x960, 8-bit) and I want to load them into a 3D matrix. In theory that matrix should take ~1.2 GB for 1001 images.
What I have so far is:
values = zeros(960, 1280, 1001, 'uint8');
for i = Start:Steps:End
file = strcat(folderStr, filenameStr, num2str(i), '.png');
img = imread(file);
values(:,:,i-Start+1) = img;
end
This code works for a small number of images, but when I use it for all 1001 images I get an "Out of memory" error.
Another problem is the speed: reading and storing 50 images takes me ~2 s, while reading 100 images takes ~48 s.
What I thought this method does is allocate the memory once and then fill the "z-slices" of the matrix picture by picture. But obviously it holds more memory than needed for that single task.
Is there any way to store the grey values of a 2D picture sequence in a 3D matrix in MATLAB without wasting that much time and resources?
Thank you
The only possibility I can see is that your indexes are bad, but I can only guess because the values of Start, Steps and End are not given. If, say, End is 100000000, Start is 1 and Steps is 99999999, you are only reading 2 images, but you are accessing values(:,:,100000000), thus making the variable incredibly huge. That is, most likely, your problem.
To solve this, create a new variable:
imagenames=Start:Steps:End; % note that using End as a variable name is a bad idea; better to call it something like endIdx
for ii=1:numel(imagenames);
file = strcat(folderStr, filenameStr, num2str(imagenames(ii)), '.png');
img = imread(file);
values(:,:,ii) = img;
end
As Shai suggests, have a look at fullfile for better filename handling.
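For example, the filename line in the loop above could become:
file = fullfile(folderStr, [filenameStr num2str(imagenames(ii)) '.png']);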

For Loop not advancing

I'm trying to read in a large number of jpg files using a for loop. But for some reason, the k index is not advancing. I just get A as a 460x520x3 uint8. Am I missing something?
My goal with this code is to convert all the jpg images to the same size. Since I haven't been able to advance through the images, I can't quite tell if I'm doing it right.
nFrames = length(date); % Number of frames.
for k = 1:nFrames-1 % Number of days
% Set file under consideration
A = imread(['map_EUS_' datestr(cell2mat(date_O3(k)),'yyyy_mm_dd') '_O3_MDA8.jpg']);
% Size of existing image A.
[rowsA, colsA, numberOfColorChannelsA] = size(A);
% Read in and get size of existing image B (the next image).
B = imread(['map_EUS_' datestr(cell2mat(date_O3(k+1)),'yyyy_mm_dd') '_O3_MDA8.jpg']);
[rowsB, colsB, numberOfColorChannelsB] = size(B);
% Size of B does not match A, so resize B to match A's size.
B = imresize(B, [rowsA colsA]);
eval(['print -djpeg map_EUS_' datestr(cell2mat(date_O3(k)),'yyyy_mm_dd') '_O3_MDA8_test.jpg']);
end
As you use imread to read the image in, it makes sense to use imwrite to write it out, rather than print/eval (also you should always think twice about using eval in MATLAB, and then think again).
You could also speed up this code somewhat - you just want to resize all the images to the size of the first one, so you don't need to keep reading in images just to measure their size. Use imfinfo to get the size, then read in, resize, and write out B only.
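A rough sketch of that approach, reusing the variables and naming pattern from the question (the "_test" output name is just an assumption):
% Get the target size once from the first image, without reading its pixel data
info = imfinfo(['map_EUS_' datestr(cell2mat(date_O3(1)),'yyyy_mm_dd') '_O3_MDA8.jpg']);
targetSize = [info.Height, info.Width];
for k = 2:nFrames
    fname = ['map_EUS_' datestr(cell2mat(date_O3(k)),'yyyy_mm_dd') '_O3_MDA8.jpg'];
    B = imresize(imread(fname), targetSize);        % read and resize B only
    imwrite(B, strrep(fname, '.jpg', '_test.jpg')); % write with imwrite instead of print/eval
end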

Recursive loop optimization

Is there a way to rewrite my code to make it faster?
for i = 2:length(ECG)
u(i) = max([a*abs(ECG(i)) b*u(i-1)]);
end;
My problem is the length of ECG.
You should pre-allocate u like this
>> u = zeros(size(ECG));
or possibly like this
>> u = NaN(size(ECG));
or maybe even like this
>> u = -Inf(size(ECG));
depending on what behaviour you want.
When you pre-allocate a vector, MATLAB knows how big the vector is going to be and reserves an appropriately sized block of memory.
If you don't pre-allocate, then MATLAB has no way of knowing how large the final vector is going to be. Initially it will allocate a short block of memory. If you run out of space in that block, then it has to find a bigger block of memory somewhere, and copy all the old values into the new memory block. This happens every time you run out of space in the allocated block (which may not be every time you grow the array, because the MATLAB runtime is probably smart enough to ask for a bit more memory than it needs, but it is still more than necessary). All this unnecessary reallocating and copying is what takes a long time.
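For example, with the zeros option, your loop becomes:
u = zeros(size(ECG)); % preallocate once
for i = 2:length(ECG)
    u(i) = max([a*abs(ECG(i)) b*u(i-1)]);
end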
There are several ways to optimize this for loop but, surprisingly, memory pre-allocation is not the part that saves the most time - not by far. You're using max to find the largest element of a 1-by-2 vector, and you build this vector on each iteration. However, all you're doing is comparing two scalars. Using the two-argument form of max and passing it two scalars is MUCH faster: 75+ times faster on my machine for large ECG vectors!
% Set the parameters and create a vector with million elements
a = 2;
b = 3;
n = 1e6;
ECG = randn(1,n);
ECG2 = a*abs(ECG); % This can be done outside the loop if you have the memory
u(1,n) = 0; % Fast zero allocation
for i = 2:length(ECG)
u(i) = max(ECG2(i),b*u(i-1)); % Compare two scalars
end
For the single input form of max (not including creation of random ECG data):
Elapsed time is 1.314308 seconds.
For my code above:
Elapsed time is 0.017174 seconds.
FYI, the code above assumes u(1) = 0. If that's not true, then u(1) should be set to its value after preallocation.

Loading multiple images in MATLAB

Here is the desired workflow:
I want to load 100 images into MATLAB workspace
Run a bunch of my code on the images
Save my output (the output returned by my code is an integer array) in a new array
By the end I should have a data structure storing the output of the code for images 1-100.
How would I go about doing that?
If you know the name of the directory they are in, or if you cd to that directory, then use dir to get the list of image names.
Now it is simply a for loop to load in the images. Store the images in a cell array. For example...
D = dir('*.jpg');
imcell = cell(1,numel(D));
for i = 1:numel(D)
imcell{i} = imread(D(i).name);
end
BEWARE that these 100 images may take up a lot of memory. For example, a single 1Kx1K image will require 3 megabytes to store if it is uint8 RGB values. This may not seem like a huge amount.
But then 100 of these images will require 300 MB of RAM. The real issue comes about if your operations on these images turn them into doubles: then they will take up 2.4 gigabytes of memory. This will quickly eat up the RAM you have, especially if you are not using a 64-bit version of MATLAB.
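If you want to verify those numbers yourself, a quick check with whos:
img8 = zeros(1000, 1000, 3, 'uint8'); % roughly 3 MB
imgD = double(img8);                  % roughly 24 MB (8 bytes per element)
whos img8 imgD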
Assuming that your images are named in a sequential way, you could do this:
N = 100;
IMAGES = cell(1,N);
FNAMEFMT = 'image_%d.png';
% Load images
for i=1:N
IMAGES{i} = imread(sprintf(FNAMEFMT, i));
end
% Run code
RESULT = cell(1,N);
for i=1:N
RESULT{i} = someImageProcessingFunction(IMAGES{i});
end
The cell array RESULT then contains the output for each image.
Be aware that depending on the size of your images, prefetching the images might make you run out of memory.
As many have said, this can get pretty big. Is there a reason you need ALL of these in memory when you are done? Could you write the individual results out as files when you are done with them such that you never have more than the input and output images in memory at a given time?
IMWRITE would be good to get them out of memory when you are done.
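A minimal sketch of that pattern (someImageProcessingFunction and the result file names are placeholders):
D = dir('*.jpg');
for i = 1:numel(D)
    img = imread(D(i).name);
    out = someImageProcessingFunction(img);     % your processing code
    save(sprintf('result_%03d.mat', i), 'out'); % write the result out, then move on
end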