MATLAB variable taking forever to save

I have a MATLAB variable that is a 3x6 cell array. One column of the cell array holds at most 150-200 small RGB images, each at most about 16x20 pixels. The remaining columns hold:
- an equal number of labels, which are strings of 3 or 4 characters,
- an image mask, which is about 350x200,
- 3 integers.
For some reason, saving this object takes a very long time, at least for an object of this size. It has already been 10 minutes (which isn't too bad, but I plan on expanding the object to hold several thousand of those small images) and MATLAB doesn't seem to be making any progress. In fact, when I look at the file in its containing directory, its size cycles between 0 bytes and about 120 kB (i.e. it increases to 120 kB in steps of 30 or 40 kB, then restarts).
Is this normal behavior? Do MATLAB variables always take so long to save? What's going on here?
Mistake: I'm saving AllData, not my SVM variable. AllData has the same data as the SVM keeper, less the actual SVM itself and one integer.
Which parts of the code would be helpful to show for solving this? The code itself is a few hundred lines and broken up into several functions. What would be important to consider when troubleshooting this? When the variable is created? When it's saved? The way I create the smaller images?
Hate to be the noob who takes a picture of their desktop, but the machine I'm working on has problems taking screenshots. Anyway, here it is:
Alldata/curdata are just subsets of the 3x7 array... actually it's a 3x8, but the last is just an int.
Interesting side point: I interrupted the saving process and the variable seemed to save just fine. I trained a new svm on the saved data and it works just fine. I'd like to not do that in the future though.
Using whos:
Name                         Size         Bytes    Class     Attributes
AllData                      3x6          473300   cell
Image                        240x352x3    253440   uint8
RESTOREDEFAULTPATH_EXECUTED  1x1          1        logical
SVMKeeper                    3x8          2355638  cell
ans                          3x6          892410   cell
curData                      3x6          473300   cell
dirpath                      1x67         134      char
im                           240x352x3    1013760  single
s                            1x1          892586   struct
Updates:
1. Does this always happen, or did you only do it once?
- It always happens
2. Does it take the same time when you save it to a different (local) drive?
- I will investigate this more when I get back to that computer
3. How long does it take to save a 500kb matrix to that folder?
- Almost instantaneous
4. And as asked above, what is the function call that you use?
- Code added below
(Image is an RGB image)
MaskedImage(:,:,1)=Image(:,:,1).*Mask;
MaskedImage(:,:,2)=Image(:,:,2).*Mask;
MaskedImage(:,:,3)=Image(:,:,3).*Mask;
MaskedImage=im2single(MaskedImage);
....
(I use some method to create a bounding box around my 16x20 image)
(this is in a loop that runs about 100-200 times)
Misfire=input('is this a misfire?','s');
if (strcmpi(Misfire,'yes'))
    curImageReal=MaskedImage(j:j+Ybound,i:i+Xbound,:);
    Training{curTrainingIndex}=curImageReal; %Training is a cell array of images
    Labels{curTrainingIndex}='ncr';
    curTrainingIndex=curTrainingIndex+1;
end
(the loop ends)...
SaveAndUpdate=input('Would you like to save this data?(say yes,definitely)','s');
undecided=1;
while(undecided)
    if(strcmpi(SaveAndUpdate,'yes,definitely'))
        AllData{curSVM,4}=Training;
        AllData{curSVM,5}=Labels;
        save(strcat(dirpath,'/',TrainingName),'AllData'); % <---STUCK HERE
        undecided=0;
    else
        DontSave=input('Im not going to save. Say YESNOSAVE to NOT SAVE','s');
        if(strcmpi(DontSave,'yesnosave'))
            undecided=0;
        else
            SaveAndUpdate=input('So... save? (say yes,definitely)','s');
        end
    end
end

It is a bit unclear whether you are doing some custom file saving or not. If you are, I'm guessing that you have a really slow save loop going on, or maybe some hardware issues. Try to save the data using MATLAB's save function:
tic
save('test.mat', 'AllData')
toc
If that works fine, try to work your way up from there, e.g. by saving one element at a time.
You can profile your code with the profiler: open it with the command profile viewer, then type the code, script or function that you want to profile into the input text field.
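The "one element at a time" suggestion could be sketched as below. The contents of AllData here are a small hypothetical stand-in (one single image, a label, a mask and an integer), and the temporary file name comes from tempname; neither is taken from the original code.

```matlab
% Hypothetical stand-in for AllData: one single-precision image, a label,
% a logical mask and an integer.
AllData = {rand(16, 20, 3, 'single'), 'ncr', true(350, 200), 3};

tmpFile = [tempname '.mat'];           % temporary file, deleted at the end
elapsed = zeros(1, numel(AllData));    % save time per element, in seconds

for k = 1:numel(AllData)
    element = AllData{k};              % save works on named variables
    tic;
    save(tmpFile, 'element');
    elapsed(k) = toc;
    fprintf('element %d (%s): %.3f s\n', k, class(element), elapsed(k));
end

delete(tmpFile);
```

If one element dominates the total, its class and size (as reported by whos) are the first things worth checking.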

This isn't a great answer, but it seems that the problem was that I was saving the version of my image after I had converted it to a single. I don't know why this would cause such a dramatic slowdown (after removing this line of code it worked instantly) so if someone could edit my answer to shed more light on the situation, that would be appreciated.

Related

How should I alter my code for my "string of Text & Numbers to Morse code" converter for the code to be able to run?

push button callback to convert to Morse
Hi, I have a problem, I'm supposed to create a GUI in MATLAB which converts letter & numbers into Morse code but my code wouldn't run, the attached image link above is for the push button callback. Also it says that the 'Morse' underlined in red needs to be preallocated for speed as it changes size every loop iteration. How should I approach this? Thanks..
Also, should I include anything under my edit1 and edit2 callbacks? Since edit1 is just for entering the input of numbers and letters and edit2 is just to output the Morse code. Thanks again!
edit1 & edit2 callbacks
"Morse" changes size every loop iteration. First of all, let's define 2 variables.
Morse_1 = [];
Morse_2 = zeros(1,100);
I'm taking the liberty of defining matrices instead of strings, since that makes it easier to explain the concept. You are basically saying that Morse_1 is a blank variable that can be filled, while Morse_2 has fixed dimensions. The dimensions of blank variables like Morse_1 (pardon me if I'm not using the correct name, but I think "blank variable" explains it quite well) are flexible. This means that doing
Morse_1(1,101) = 1
will work (Morse_1 will be a 1x101 vector with 100 zeros and a 1 at the 101st position). Doing
Morse_2(1,101) = 1
will work as well, but you might end up with too many unused elements if you largely overestimate the dimensions (e.g. zeros(1,1000) but your message actually only reaches a few hundred).
In your case, I'd use a blank variable, since you don't really know beforehand how long your coded message is going to be (even if you knew the number of characters in your original string, the coded message would be 5 times longer if it were all '9's than all 'e's). This warning is really useful when dealing with 1000x1000 matrices, but for processing strings I'd ignore it.
To sum it up, I'd use a blank variable if you have no idea how long it'll get, or if your code can't handle a variable length, or if you don't want to worry about calculating exactly how many elements are needed. On the other hand, I'd use fixed dimensions if your code needs a properly dimensioned array, or if you're working with very large arrays. For a lot of cases, though, you really won't notice the speed difference (filling a blank array might take 0.01s, while filling a fixed dimension one might take 0.001s. Unless you're doing this a thousand times (why??), it's literally unnoticeable).
Personally, I'd change the way this loop works using strrep() like this:
for i=1:length(alphabet) %alphabet = 26 letters+10 numbers+space, 37 characters in total
    original_message = strrep(original_message,alphabet{i},morse_alphabet{i});
end
strrep(a,b,c) finds every occurrence of the substring b in a and replaces it with c. In your case, alphabet is the same as the dictionary chars, and morse_alphabet is the same as the dictionary code.
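As an alternative sketch that also makes the preallocation warning go away, the message can be converted one character at a time into a preallocated cell array and joined at the end. The dictionary below is a hypothetical four-symbol subset, and the names chars, codes, dict and msg are illustrative; the real GUI code would use the full alphabet.

```matlab
% Hypothetical dictionary: only a few symbols are shown for brevity.
chars = {'s', 'o', 'e', '9'};
codes = {'...', '---', '.', '----.'};
dict  = containers.Map(chars, codes);

msg = 'sos';
out = cell(1, numel(msg));      % one cell per input character, allocated once
for k = 1:numel(msg)
    out{k} = dict(msg(k));      % look up the code for this character
end
morse = strjoin(out, ' ');      % join the codes with single spaces
```

Because the number of cells equals the number of input characters, the output size is known up front even though the length of each code is not.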
As for the callbacks, I don't really know about it, so I can't help you with that.

Manipulating large sets of data in Matlab, asking for advice on a few things, cells and numeric array operations, with performance in mind

This is a cross-post from here:
Link to post in the Mathworks community
Currently I'm working with large data sets. I've saved those data sets as MATLAB files, the two biggest being 9.5GB and 5.9GB.
These files each contain a 1x8 cell array (this is done for addressability and to prevent mixing up data from each of the 8 cells, and I specifically wanted to avoid eval).
Each cell then contains a 3D double matrix; for one it's 1001x2002x201 and for the other it's 2003x1001x201 (when I process it I chop off 1 row at the end to get it to 2002).
I'm already running my script on a server (64 cores and plenty of RAM; MATLAB crashed on my laptop, as I need more than 12GB of RAM on Windows). Nonetheless it still takes several hours for my script to finish, and I still need to do some extra operations on the data, which is why I'm asking for advice.
For some of the large cell arrays, I need to find the maximum value over the entire set of all 8 cells. Normally I would run a for loop to get the maximum of each cell, store each value in a temporary numeric array, and then use the function max again. This will work for sure; I'm just wondering if there's a better, more efficient way.
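The two-step maximum described above can be written compactly with cellfun. This is only a sketch on a tiny hypothetical cell array A; it does the same work as the for loop and is not necessarily faster on the real data.

```matlab
% Hypothetical small stand-in for the 1x8 cell array of 3D matrices.
A = {rand(4, 5, 3), 10 * rand(4, 5, 3), rand(4, 5, 3)};

% Maximum of each cell (c(:) flattens the 3D matrix), then the maximum of those.
perCellMax = cellfun(@(c) max(c(:)), A);
overallMax = max(perCellMax);
```

Note that max skips NaN entries by default, which matters for matrices that were pre-filled with nan.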
After I find the maximum I need to do a manipulation over all this data as well, normally I would do something like this for an array:
B=A./maxvaluefound;
A(B > a) = A(B > a)*constant;
Now I could put this in a for loop, address each cell and run this; however, I'm not sure how efficient that would be. Do you think there's a better way than a for loop that's not extremely complicated/difficult to implement?
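For reference, the per-cell loop could look like the sketch below. The values of maxvaluefound, a and constant are hypothetical, and A is a tiny stand-in for the real data. Computing the logical mask once avoids evaluating the comparison twice and avoids keeping a full copy B of every cell.

```matlab
% Hypothetical stand-in data and parameters.
A = {[1 2; 3 4], [5 6; 7 8]};
maxvaluefound = 8;
a = 0.5;
constant = 10;

for k = 1:numel(A)
    mask = (A{k} ./ maxvaluefound) > a;    % logical mask, computed once per cell
    A{k}(mask) = A{k}(mask) * constant;    % scale only the selected elements
end
```

Since each cell is processed independently, the loop body has the same shape as the parfor loops used elsewhere in the script.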
There's one more thing I need to do, which is really important. As I said before, each cell is a slice (consider it time), and inside each slice are the values for a 3D matrix/plot. I need to interpolate the data so that I get more slices, because I want to create slices/frames/plots for a movie/gif. I'm planning on plotting the 3D data using scatter3, with the data represented by color, and on using alpha values to make it see-through so that one can actually see the intensity in this 3D plot. I understand how to use griddata, but apparently it's quite slow, and some of the other methods were hard to understand. What would be the best way to interpolate these (time) slices efficiently over the different cells in the cell array? Please explain it if you can, preferably with an example.
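One simple possibility, assuming that every cell lies on the same spatial grid and that linear blending in time is acceptable, is a weighted average of two neighbouring slices. This is a swapped-in technique rather than griddata, and the names slices, nBetween and frames are hypothetical.

```matlab
% Two hypothetical time slices on the same grid.
slices = {zeros(2, 2), 10 * ones(2, 2)};

nBetween = 4;                          % number of in-between frames to generate
w = (1:nBetween) / (nBetween + 1);     % blend weights strictly between 0 and 1

frames = cell(1, nBetween);
for k = 1:nBetween
    % Linear blend: w = 0 would reproduce slices{1}, w = 1 would reproduce slices{2}.
    frames{k} = (1 - w(k)) * slices{1} + w(k) * slices{2};
end
```

Because the blend is element-wise, no scattered-data interpolation is needed once the data is on a uniform grid.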
I've added a pic for the Linux server info I'm running it on below, note I can not update the matlab version unfortunately, it's R2016a:
I've also attached part of my code to give a better idea of what I'm doing:
if (or(L03==2,L04==2)) % check if this section needs to be executed based on parameters set at top of file
    load('../loadfilewithpathnameonmypc.mat')
    E_field_650nm_intAll=cell(1,8); %create empty cell array
    parfor ee=1:8 %run for loop for cell array, changed this to a parfor to increase speed by approximately 8x
        E_field_650nm_intAll{ee}=nan(szxit(1),szxit(2),xres); %create nan-filled matrix in cell 1-8
        for qq=1:2:xres
            tt=(qq+1)/2; %consecutive number instead of spacing 2
            T1=griddata(Xsall{ee},Ysall{ee},EfieldsAll{ee}(:,:,qq)',XIT,ZIT,'natural'); %change data on non-uniform grid to uniform gridded data
            E_field_650nm_intAll{ee}(:,:,tt)=T1; %fill up each cell with uniform data
        end
    end
    clear T1
    clear qq tt
    clear ee
    save('../savelargefile.mat', 'E_field_650nm_intAll', '-v7.3')
end
if (L05==2) % check if this section needs to be executed based on parameters set at top of file
    if ~exist('E_field_650nm_intAll','var') % if variable not in workspace load it
        load('../loadanotherfilewithpathnameonmypc.mat');
    end
    parfor tt=1:8 %run for loop for cell array, changed this to a parfor to increase speed by approximately 8x
        CFxLight{tt}=nan(szxit(1),szxit(2),xres); %create nan-filled matrix in cells 1 to 8
        for qq=1:xres
            CFs=Cafluo3D{tt}(1:lxq2,:,qq)'; %get matrix slice and transpose matrix for point-wise multiplication
            CFxLight{tt}(:,:,qq)=CFs.*E_field_650nm_intAll{tt}(:,:,qq); %point-wise multiply the two large matrices for each cell and put in new cell array
        end
    end
    clear CFs
    clear qq tt
    save('../saveanotherlargefile.mat', 'CFxLight', '-v7.3')
end

MATLAB: making a histogram plot from csv files read and put into cells?

Unfortunately I am not too tech proficient and only have a basic MATLAB/programming background...
I have several csv data files in a folder, and would like to make a histogram plot of all of them simultaneously in order to compare them. I am not sure how to go about doing this. Some digging online gave a script:
d=dir('*.csv'); % return the list of csv files
for i=1:length(d)
m{i}=csvread(d(i).name); % put into cell array
end
The problem is I cannot now simply write histogram(m(i)) command, because m(i) is a cell type not a csv file type (I'm not sure I'm using this terminology correctly, but MATLAB definitely isn't accepting the former).
I am not quite sure how to proceed. In fact, I am not sure what exactly is the nature of the elements m(i) and what I can/cannot do with them. The histogram command wants a matrix input, so presumably I would need a 'vector of matrices' and a command which plots each of the vector elements (i.e. matrices) on a separate plot. I would have about 14 altogether, which is quite a lot and would take a long time to load, but I am not sure how to proceed more efficiently.
Generalizing the question:
I will later be writing a script to reduce the noise and smooth out the data in the csv file, and binarise it (the csv files are for noisy images with vague shapes, and I want to distinguish these shapes by setting a cut-off for the pixel intensity/value in the csv matrix, so as to create a binary image showing these shapes). Ideally, I would like to apply this to all of the images in my folder at once so I can sift out which images are best for analysis. So my question is, how can I run a script on all of the csv files in my folder so that I can compare them all at once? I presume whatever technique I use for the histogram plots can apply to this too, but I am not sure.
It would probably be better to write a script which:
- makes a histogram plot and/or runs the binarising script for each csv file in the folder
- and puts all of the images into a new, designated folder, so I can sift through these.
I would greatly appreciate pointers on how to do this. As I mentioned, I am quite new to programming and am getting overwhelmed when looking at suggestions, seeing various different commands used to apparently achieve the same thing- reading several files at once.
The function csvread natively returns a matrix. I am not sure, but it is possible that if some elements inside the csv file are not numbers, MATLAB automatically makes a cell array out of the output. Since I don't know the structure of your csv files, I recommend trying some similar functions (readtable, xlsread):
M = readtable(d(i).name) % Reads table like data, most recommended
M = xlsread(d(i).name) % Excel like structures, but works also on similar data
Try them out and let me know if it worked. If not please upload a file sample.
The function csvread(filename) always returns a numeric matrix M; it never returns a cell.
If the .csv file contains text, csvread gives an error because the data is not purely numeric. The only reason I can see for using a cell array when reading the files is that the matrices read from the individual files have different dimensions, for example the first .csv file holds data organised as 3xA and the second as 2xB; a cell array lets you keep them all in a single structure.
However, you can still use histogram with a cell array by extracting the element as an array rather than as a cell.
If M is a cell array, there are two ways to index it:
M(i) and M{i}. M(i) gives you a 1x1 cell, which cannot be passed to histogram, whereas M{i} returns the element in its original form, which is a numeric matrix.
TL;DR use histogram(M{i}) instead of histogram(M(i)).
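Putting the pieces together, a batch script along the lines the question asks for might look like the sketch below. To stay self-contained it first writes two small hypothetical CSV files into a temporary folder; the file names, the folder name histograms and the PNG output format are assumptions, not part of the original question.

```matlab
% Self-contained setup: two small hypothetical CSV files in a temporary folder.
srcDir = tempname;
mkdir(srcDir);
csvwrite(fullfile(srcDir, 'a.csv'), magic(4));
csvwrite(fullfile(srcDir, 'b.csv'), rand(3, 5));

% Read every CSV file in the folder into a preallocated cell array.
d = dir(fullfile(srcDir, '*.csv'));
m = cell(1, numel(d));
for i = 1:numel(d)
    m{i} = csvread(fullfile(srcDir, d(i).name));
end

% Plot one histogram per file and save it into a designated output folder.
outDir = fullfile(srcDir, 'histograms');   % hypothetical output folder
mkdir(outDir);
for i = 1:numel(m)
    fig = figure('Visible', 'off');        % do not pop up a window per file
    histogram(m{i}(:));                    % braces give the matrix, (:) flattens it
    title(d(i).name, 'Interpreter', 'none');
    [~, base] = fileparts(d(i).name);
    saveas(fig, fullfile(outDir, [base '.png']));
    close(fig);
end
```

The same loop structure would work for a binarising step: replace the histogram call with the processing function and save its result instead.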

Why is the "imrotate" function clone I wrote in MATLAB very slow?

Sorry for my bad English.
I'm learning MATLAB for image processing, so I tried to clone the built-in "imrotate" function.
The first version I wrote used a for loop to traverse the whole matrix, so it was very, very slow.
Then I read this thread:
Image rotation by Matlab without using imrotate
and tried to vectorize my program, which made it much faster.
But it is still very slow compared to the built-in imrotate: mine takes more than 1 second to rotate an image (resolution 1920x1080), while the built-in implementation does the job in less than 50 milliseconds.
So I wonder: is there still something wrong with my code, or is this "normal" in MATLAB?
Here is my code:
(P.S. There is some ugly code (value11=...; value12=...; value21=...) because I am not familiar with MATLAB and could not figure out shorter code that avoids loops.)
function result=my_imRotate(image,angel,method)
    function result=my_imRotateSingleChannel(image,angel,method)
        angel=-angel/180*pi; %convert angel from degrees to radians
        [height,width]=size(image); %get the image size
        trMatrix=[cos(angel),-sin(angel);sin(angel),cos(angel)]; %the transformation matrix
        imgSizeVec=[width,height;width,-height]; %the "size vector" to be transformed to calculate the new size
        newImgSizeVec=imgSizeVec*trMatrix;
        newWidth=ceil(max(newImgSizeVec(:,1)));
        newHeight=ceil(max(newImgSizeVec(:,2))); %calculate the new size
        [oldX,oldY]=meshgrid(1:newWidth,1:newHeight);
        oldX=oldX-newWidth/2;
        oldY=oldY-newHeight/2;
        temp=[oldX(:) oldY(:)]*trMatrix;
        oldX=temp(:,1);
        oldY=temp(:,2);
        oldX=oldX+width/2;
        oldY=oldY+height/2;
        switch(method)
            case 'nearest'
                oldX=round(oldX);
                oldY=round(oldY);
                condition=( oldX>=1 & oldX<=width & oldY>=1 & oldY<=height );
                result(condition)=image((oldX(condition)-1)*height+oldY(condition));
                result(~condition)=0;
                result=reshape(result,newHeight,newWidth);
            case 'bilinear'
                x1=floor(oldX);
                x2=x1+1;
                y1=floor(oldY);
                y2=y1+1;
                condition11=(x1>=1&x1<=width&y1>=1&y1<=height);
                condition12=(x1>=1&x1<=width&y2>=1&y2<=height);
                condition21=(x2>=1&x2<=width&y1>=1&y1<=height);
                condition22=(x2>=1&x2<=width&y2>=1&y2<=height);
                value11(condition11)=double(image((x1(condition11)-1)*height+y1(condition11)));
                value12(condition12)=double(image((x1(condition12)-1)*height+y2(condition12)));
                value21(condition21)=double(image((x2(condition21)-1)*height+y1(condition21)));
                value22(condition22)=double(image((x2(condition22)-1)*height+y2(condition22)));
                value11(~condition11)=0;
                value12(~condition12)=0;
                value21(~condition21)=0;
                value22(~condition22)=0;
                result=uint8(value22.'.*(oldX-x1).*(oldY-y1)+value11.'.*(x2-oldX).*(y2-oldY)+value21.'.*(oldX-x1).*(y2-oldY)+value12.'.*(x2-oldX).*(oldY-y1));
                result=reshape(result,newHeight,newWidth);
            otherwise
                disp('Sorry, unsupported algorithm. Only nearest/bilinear is supported.');
        end
    end
    imageInfo=size(image);
    imageType=size(imageInfo);
    if(imageType(2)==2)
        result=my_imRotateSingleChannel(image,angel,method);
    elseif(imageType(2)==3&&imageInfo(3)==3)
        temp=my_imRotateSingleChannel(image(:,:,1),angel,method);
        [newHeight,newWidth]=size(temp);
        result=temp(:);
        temp=my_imRotateSingleChannel(image(:,:,2),angel,method);
        result=[result temp(:)];
        temp=my_imRotateSingleChannel(image(:,:,3),angel,method); %third channel (was channel 2 by mistake)
        result=[result temp(:)];
        result=reshape(result,newHeight,newWidth,3);
    else
        disp('Sorry, unsupported input matrix. Only grey/rgb image is supported.');
    end
end
The bulk of the work in imrotate is done by the internal imrotatemex command, which is implemented in compiled C code.
Your code creates numerous temporary arrays that are as big as the original image. This is typical of vectorized Matlab code. Memory allocation takes a little bit of time but nothing close to 1sec. I'm guessing the main issue is that each new array is not in the cache, so you are constantly reading and writing from "cold" main memory. In comparison, the compiled C code uses a for loop, probably similar to your original implementation, keeping all the necessary data in CPU registers or on the stack, performing much better with the cache.
But @Shai is right - you should profile and find out which parts are slow. 1 second sounds too slow for rotating an HD image.
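Besides the profiler, a quick way to see where the time goes is to wrap each stage in tic/toc. The sketch below times a simplified nearest-neighbour rotation on random data of the same 1920x1080 size; it keeps the output the same size as the input (unlike the code in the question, which enlarges it), and all variable names are illustrative.

```matlab
% Random stand-in image with the resolution mentioned in the question.
height = 1080;
width  = 1920;
img    = uint8(255 * rand(height, width));
theta  = -30 / 180 * pi;                       % 30 degrees, in radians
R      = [cos(theta), -sin(theta); sin(theta), cos(theta)];

tic;                                           % stage 1: build the pixel grid
[X, Y] = meshgrid(1:width, 1:height);
tGrid = toc;

tic;                                           % stage 2: rotate the coordinates
P    = [X(:) - width/2, Y(:) - height/2] * R;
srcX = round(P(:, 1) + width/2);
srcY = round(P(:, 2) + height/2);
tTransform = toc;

tic;                                           % stage 3: gather source pixels
ok  = srcX >= 1 & srcX <= width & srcY >= 1 & srcY <= height;
out = zeros(height, width, 'uint8');           % preallocated output
out(ok) = img((srcX(ok) - 1) * height + srcY(ok));
tGather = toc;

fprintf('grid %.3f s, transform %.3f s, gather %.3f s\n', tGrid, tTransform, tGather);
```

Timings from a single run are noisy, so repeating the measurement a few times gives a more reliable picture.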

Create a circular buffer for image acquisition

I'm new in programming with matlab and trying to do the following:
I continuously capture an image (size 1024x1024) with a camera, using the getdata function, to have a live image.
To do a measurement I would like to store only 100 images using a circular buffer. More precisely, I'm thinking of storing 100 images, erasing the oldest images when new data is acquired, and doing a measurement on the last 100 images.
Hope my concern is understandable...
Thanks for an answer!
This question has been answered here by a MathWorks employee: Create a buffer matrix for continuous measurements. (He also made a video of it: http://blogs.mathworks.com/videos/2009/05/08/implementing-a-simple-circular-buffer/)
The code :
buffSize = 10;
circBuff = nan(1,buffSize);
for newest = 1:1000
    circBuff = [newest circBuff(1:end-1)]
end
Check the update made by gnovice which applies the circular buffer to image processing.
What you call a "circular buffer" is known as a queue or FIFO (First In, First Out). Usually this would be stored in a linked list data structure, where every object (matrix, in your case) points to the next object. In MATLAB, however, there is no built-in linked list structure, but MATLAB arrays (vectors/matrices) are pretty flexible and efficient when it comes to manipulating them.
So you can simply store each image as a matrix inside an array of length 100, giving you a 3 dimensional matrix of dimensions 100x1024x1024. Then, when you get new data you simply remove the last matrix from the array and insert a new matrix at the beginning of the array. Hopefully this will be fast enough for you.
Good luck!
Maybe you can create an array of 100 1024x1024 matrices and refer to the following link for how to maintain the read and write positions.
logic of circular buffer
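A minimal sketch of the write-position idea for images: keep a preallocated 3D array and overwrite the oldest slot using a modulo index, so no data is shifted when a new frame arrives. The frame here is synthetic (a constant image whose value is the frame number) as a stand-in for a frame from the camera, and frameSize is reduced to 64 to keep the example light; all names are illustrative.

```matlab
frameSize = 64;       % use 1024 for the camera described in the question
nFrames   = 100;      % capacity of the buffer
buf   = zeros(frameSize, frameSize, nFrames, 'uint8');   % allocated once
count = 0;            % total number of frames written so far

for n = 1:250
    % Synthetic stand-in for an acquired frame.
    frame = uint8(mod(n, 256)) * ones(frameSize, frameSize, 'uint8');

    idx = mod(count, nFrames) + 1;     % slot to overwrite (the oldest one)
    buf(:, :, idx) = frame;
    count = count + 1;
end

% Recover the stored frames in chronological order, oldest first.
nValid  = min(count, nFrames);
newest  = mod(count - 1, nFrames) + 1;
order   = mod((newest - nValid):(newest - 1), nFrames) + 1;
ordered = buf(:, :, order);
```

After 250 frames the buffer holds frames 151 to 250, and ordered presents them oldest to newest for the measurement.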