I recently started getting the error

MATLAB: corrupted double-linked list

about 90% of the time when running a moderately complex MATLAB model on a supercomputing cluster. The model runs fine on my laptop (about 15 hours per run; the cluster is used for parameter sweeps) and has done so for nearly two years. The only difference in recent runs is that the output is more verbose and creates a large (1.5 GB) array that wasn't there previously.
The general pattern for this array is a 3D array built by saving a 2D slice of the model at each timestep. The array is initialised outside the timestepping loop, and slices are overwritten as the model progresses:
%** init
big_array = zeros(a, b, c);

%** loop
for i = 1:c
    %%%% DO MODEL %%%%

    %** save snapshot to array
    big_array(:,:,i) = modelSnapshot';
end
I have checked that the indexing of this array is correct (i.e. big_array(:,:,i) = modelSnapshot' has the correct dimensions/size).
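For what it's worth, a defensive size check at the write site can rule out a silent dimension mismatch on the cluster runs; a minimal sketch, reusing a, b, i and modelSnapshot from the snippet above:

```matlab
% hedged sketch: assert the slice shape before each write into big_array
expected = [a, b];
actual   = size(modelSnapshot');
assert(isequal(actual, expected), ...
    'Snapshot is %s at step %d, expected %s', ...
    mat2str(actual), i, mat2str(expected));
big_array(:,:,i) = modelSnapshot';
```

If the assertion never fires on the cluster, the indexing really is sound and the problem lies elsewhere (e.g. memory, not logic).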
Does anyone have experience with this error who can point me towards a solution? The only relevant results I can find on Google concern MATLAB's MEX-file functionality, which my model does not use.
(Crashes are on MATLAB R2016a; the laptop runs R2014a.)
This is a cross-post from here:
Link to post in the Mathworks community
Currently I'm working with large data sets; I've saved those data sets as MATLAB files, with the two biggest files being 9.5 GB and 5.9 GB.
These files each contain a 1x8 cell array (this is done for addressability and to prevent mixing up data from each of the 8 cells; I specifically wanted to avoid eval).
Each cell contains a 3D double matrix: for one it's 1001x2002x201 and for the other 2003x1001x201 (when processing it I chop off 1 row at the end to get it to 2002).
Now, I'm already running my script and processing the data on a server (64 cores and plenty of RAM; MATLAB crashed on my laptop, as I need more than 12 GB of RAM on Windows). Nonetheless it still takes several hours to finish running my script, and I still need to do some extra operations on the data, which is why I'm asking for advice.
For some of the large cell arrays, I need to find the maximum value over the entire set of all 8 cells. Normally I would run a for loop to get the maximum of each cell, store each value in a temporary numeric array, and then use max again. This will work for sure; I'm just wondering if there's a better, more efficient way.
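The loop-plus-temporary-array approach described above can be collapsed into one line with cellfun; a sketch, where C stands for the 1x8 cell array of 3D matrices (the name is assumed for illustration):

```matlab
% hedged sketch: global maximum over all 8 cells of a cell array C
maxvaluefound = max(cellfun(@(M) max(M(:)), C));
```

Note this is still a loop over the cells under the hood, so it is no faster than the explicit for loop, just more compact.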
After I find the maximum, I need to apply a manipulation over all this data as well. Normally I would do something like this for a single array:
B=A./maxvaluefound;
A(B > a) = A(B > a)*constant;
Now I could put this in a for loop, address each cell and run it; however, I'm not sure how efficient that would be. Do you think there's a better way than a for loop that's not extremely complicated/difficult to implement?
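The same cellfun pattern, with 'UniformOutput' set to false, applies the two-line manipulation above to every cell; a sketch, where C, a and constant are the names assumed from the snippets:

```matlab
% hedged sketch: where M./maxvaluefound > a, multiply by constant,
% otherwise leave the value unchanged (same effect as the two-line version)
scaleOne = @(M) M .* (1 + ((M./maxvaluefound) > a) * (constant - 1));
C = cellfun(scaleOne, C, 'UniformOutput', false);
```

This is no faster than a plain for loop over the 8 cells (cellfun loops internally), but folding the mask into one expression avoids materialising the intermediate B array for each cell.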
There's one more thing I need to do which is really important: each cell, as I said before, is a slice (consider it time), while inside each slice are the values for a 3D matrix/plot. Now I need to interpolate the data so that I get more slices. The reason I need to do this is that I need to create slices/frames/plots to make a movie/GIF. I'm planning on plotting the 3D data using scatter3, with the data represented by colour, and I plan on using alpha values to make it see-through so that one can actually see the intensity in the 3D plot. I understand how to use griddata, but apparently it's quite slow, and some of the other methods were hard to understand. So, what would be the best way to interpolate these (time) slices efficiently over the different cells in the cell array? Please explain it if you can, preferably with an example.
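On the time-interpolation point specifically: since the 8 cells are equally sized slices in time, one relatively cheap option is to interpolate along the cell (time) axis with interp1 rather than re-gridding in space. A sketch under assumed names (slices is the 1x8 cell array of equal-size 3D matrices, nFramesOut the desired frame count), with linear interpolation in time assumed acceptable:

```matlab
% hedged sketch: generate intermediate time frames by linear interpolation
% across the cells of a 1x8 cell array 'slices' of equal-size 3D matrices
nCells = numel(slices);                  % 8 in the question
[nx, ny, nz] = size(slices{1});
stack = cat(4, slices{:});               % nx x ny x nz x nCells
stack = permute(stack, [4 1 2 3]);       % time first: nCells x nx x ny x nz
tOut  = linspace(1, nCells, nFramesOut); % finer time grid
fine  = interp1(1:nCells, stack(:,:), tOut);      % interpolate every voxel
fine  = reshape(fine, [nFramesOut, nx, ny, nz]);
fine  = permute(fine, [2 3 4 1]);        % back to nx x ny x nz x time
```

Beware that cat(4, ...) materialises the whole set in one array, so for data at the 9.5 GB scale this would need to be done block-wise (e.g. per z-slice) rather than in one shot.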
I've added a picture of the Linux server info I'm running it on below; note that I cannot update the MATLAB version, unfortunately: it's R2016a.
I've also attached part of my code to give a better idea of what I'm doing:
if (or(L03==2, L04==2)) % check if this section needs to be executed, based on parameters set at top of file
    load('../loadfilewithpathnameonmypc.mat')
    E_field_650nm_intAll = cell(1,8); % create empty cell array
    parfor ee = 1:8 % loop over the cell array; changed to parfor for roughly 8x speed-up
        E_field_650nm_intAll{ee} = nan(szxit(1), szxit(2), xres); % create NaN-filled matrix in cells 1-8
        for qq = 1:2:xres
            tt = (qq+1)/2; % consecutive index instead of spacing of 2
            T1 = griddata(Xsall{ee}, Ysall{ee}, EfieldsAll{ee}(:,:,qq)', XIT, ZIT, 'natural'); % resample data from non-uniform grid onto uniform grid
            E_field_650nm_intAll{ee}(:,:,tt) = T1; % fill each cell with the uniform data
        end
    end
    clear T1 qq tt ee
    save('../savelargefile.mat', 'E_field_650nm_intAll', '-v7.3')
end
if (L05==2) % check if this section needs to be executed, based on parameters set at top of file
    if ~exist('E_field_650nm_intAll','var') % if the variable is not in the workspace, load it
        load('../loadanotherfilewithpathnameonmypc.mat');
    end
    parfor tt = 1:8 % loop over the cell array; changed to parfor for roughly 8x speed-up
        CFxLight{tt} = nan(szxit(1), szxit(2), xres); % create NaN-filled matrix in cells 1 to 8
        for qq = 1:xres
            CFs = Cafluo3D{tt}(1:lxq2,:,qq)'; % get matrix slice and transpose it for element-wise multiplication
            CFxLight{tt}(:,:,qq) = CFs .* E_field_650nm_intAll{tt}(:,:,qq); % element-wise multiply the two large matrices for each cell and store in the new cell array
        end
    end
    clear CFs qq tt
    save('../saveanotherlargefile.mat', 'CFxLight', '-v7.3')
end
Well, I am trying to implement an algorithm in MATLAB. It requires using a slice of a high-dimensional array inside a for loop. When I use indexing, MATLAB creates an additional copy of that slice, and since my array is huge this takes a lot of time.

slice = x(startInd:endInd);

What I am trying to do is use that slice without copying it. I just need the slice data as input to a linear operator; I won't update that part during the iterations.
To do so, I tried to write a MEX file whose output is a double-type array whose size is equal to the intended slice size.
plhs[0] = mxCreateUninitNumericMatrix(0, 0, mxDOUBLE_CLASS, mxREAL); // initialise but do not allocate any additional memory
ptr1 = mxGetPr(prhs[0]); // get the pointer to the input data

Then set the pointer of the output to the starting index of the input data:

mxSetPr(plhs[0], ptr1 + startInd);
mxSetM(plhs[0], 1);
mxSetN(plhs[0], (endInd - startInd)); // update the dimensions as intended
When I set the starting index to zero, it works fine. When I give values other than 0, the MEX file compiles without error, but MATLAB crashes when the MEX function is called:

slice = mex_slicer(x, startInd, endInd);

What might be the problem here?
The way you assign the data pointer to the array means that MATLAB will attempt to free that memory when the array is deleted or has something else assigned to it. Attempting to call free on a pointer that was not obtained from the allocator (here, an offset into the middle of another array's buffer) will cause a crash.

Unfortunately, MATLAB does not support "views": arrays that point at part of a different array. So there is no way to do what you want to do.
An alternative solution would be to:
store your data differently, so that it doesn't take as much time to index (e.g. in smaller arrays), or
perform all your computations in C or C++ inside a MEX-file, where you can very simply point at sub-ranges of a larger data block.
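If the MEX route is kept, a version that copies the requested range into a MATLAB-owned buffer avoids the invalid free entirely, at the cost of the copy the question was trying to avoid. A minimal sketch (safe_slicer and its 0-based startInd/endInd convention are illustrative, not the poster's interface):

```c
/* safe_slicer.c: s = safe_slicer(x, startInd, endInd) returns a copy of
 * x(startInd+1 : endInd) as a 1x(endInd-startInd) row vector.
 * Build with: mex safe_slicer.c */
#include <string.h>
#include "mex.h"

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    if (nrhs != 3 || !mxIsDouble(prhs[0]))
        mexErrMsgTxt("usage: safe_slicer(double_array, startInd, endInd)");

    mwSize start = (mwSize)mxGetScalar(prhs[1]);  /* 0-based offset */
    mwSize stop  = (mwSize)mxGetScalar(prhs[2]);
    mwSize len   = stop - start;

    plhs[0] = mxCreateDoubleMatrix(1, len, mxREAL); /* MATLAB owns this buffer */
    memcpy(mxGetPr(plhs[0]), mxGetPr(prhs[0]) + start, len * sizeof(double));
}
```

Because the output buffer was allocated by mxCreateDoubleMatrix, MATLAB is free to deallocate it normally, which is exactly what the pointer-aliasing version violated.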
See this FEX submission on creating MATLAB variables that "point" to the interior data of an existing variable. You can either do it as a shared data copy, which is designed to be safe (but incurs some additional overhead), or as an unprotected direct reference (faster, but risks crashing MATLAB if you don't clear it properly).
https://www.mathworks.com/matlabcentral/fileexchange/65842-sharedchild-creates-a-shared-data-copy-of-a-contiguous-subsection-of-an-existing-variable
I want to run a parfor loop in MATLAB with the following code:

B = load('dataB.mat'); % B is a 1600x100 matrix stored as 'dataB.mat' in the local folder
simN = 100;
cof = cell(1, simN);
se = cell(1, simN);
parfor s = 1:simN
    [estimates, SE] = fct(0.5, [0.1, 0.8, 10]', B(:,s));
    cof{s} = estimates';
    se{s} = SE';
end
However, the code doesn't seem to work: there are no warnings, it just runs forever without any output. I terminated the loop and found that it never entered the function fct. Any help would be appreciated on how to load external data like dataB.mat for parallel computing in MATLAB.
If I type this in my console:

rand(1600,100)

and then save my current workspace as dataB.mat, this command:

B = load('dataB.mat');

will bring me a 1x1 struct containing an ans field holding the 1600x100 double matrix. So, since in each iteration of your application you must extract a column of B before calling the function fct (the extracted column becomes the third argument of your call, and B(:,s) on a struct will not give you a column of your data), I'm wondering whether you checked your B variable's composition with a breakpoint before proceeding to the parfor loop.
Also, keep in mind that the first time you execute a parfor loop in a brand-new MATLAB instance, the MATLAB engine must instantiate all the workers, and this may take a very long time. Be patient and, eventually, run a second test to see whether the problem persists once you are certain the workers have been instantiated.
If those aren't the causes of your issue, I suggest running a standard loop (for instead of parfor) and setting a breakpoint on the first line of the iteration. This should help you spot the problem very quickly.
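A corrected version of the loading step might look like this; a sketch assuming the matrix inside dataB.mat was saved under the variable name B:

```matlab
% hedged sketch: unwrap the struct returned by load before the parfor loop
S = load('dataB.mat');   % S is a 1x1 struct, one field per saved variable
B = S.B;                 % extract the 1600x100 matrix (field name assumed)
simN = 100;
cof = cell(1, simN);
se  = cell(1, simN);
parfor s = 1:simN
    [estimates, SE] = fct(0.5, [0.1, 0.8, 10]', B(:, s)); % B is now a plain matrix
    cof{s} = estimates';
    se{s}  = SE';
end
```

Unwrapping the struct before the parfor loop also means only the plain matrix, not the whole struct, is broadcast to the workers.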
Sorry for my bad English.
I'm learning MATLAB for image processing, so I tried to clone the default imrotate function.
The first version I wrote traversed the whole matrix with for loops, so it was very, very slow.
Then I read this thread:
Image rotation by Matlab without using imrotate
and tried to vectorise my program, and it became much faster.
But it is still very slow compared to the default imrotate implementation: more than 1 second to rotate a 1920x1080 image, while the default implementation does the job in less than 50 milliseconds.
So I wonder: is there still something wrong with my code, or is this "normal" for MATLAB?
Here is my code.
(P.S. there is some ugly code (value11=...; value12=...; value21=...) because I am not familiar with MATLAB and was unable to figure out shorter loop-free code.)
function result = my_imRotate(image, angel, method)
    function result = my_imRotateSingleChannel(image, angel, method)
        angel = -angel/180*pi; % convert the angle from degrees to radians
        [height, width] = size(image); % get the image size
        trMatrix = [cos(angel), -sin(angel); sin(angel), cos(angel)]; % the transformation matrix
        imgSizeVec = [width, height; width, -height]; % the "size vectors" to be transformed to calculate the new size
        newImgSizeVec = imgSizeVec*trMatrix;
        newWidth = ceil(max(newImgSizeVec(:,1)));
        newHeight = ceil(max(newImgSizeVec(:,2))); % calculate the new size
        [oldX, oldY] = meshgrid(1:newWidth, 1:newHeight);
        oldX = oldX - newWidth/2;
        oldY = oldY - newHeight/2;
        temp = [oldX(:) oldY(:)]*trMatrix; % map output pixels back to source coordinates
        oldX = temp(:,1) + width/2;
        oldY = temp(:,2) + height/2;
        switch (method)
            case 'nearest'
                oldX = round(oldX);
                oldY = round(oldY);
                condition = (oldX>=1 & oldX<=width & oldY>=1 & oldY<=height);
                result(condition) = image((oldX(condition)-1)*height + oldY(condition));
                result(~condition) = 0;
                result = reshape(result, newHeight, newWidth);
            case 'bilinear'
                x1 = floor(oldX);
                x2 = x1 + 1;
                y1 = floor(oldY);
                y2 = y1 + 1;
                condition11 = (x1>=1 & x1<=width & y1>=1 & y1<=height);
                condition12 = (x1>=1 & x1<=width & y2>=1 & y2<=height);
                condition21 = (x2>=1 & x2<=width & y1>=1 & y1<=height);
                condition22 = (x2>=1 & x2<=width & y2>=1 & y2<=height);
                value11(condition11) = double(image((x1(condition11)-1)*height + y1(condition11)));
                value12(condition12) = double(image((x1(condition12)-1)*height + y2(condition12)));
                value21(condition21) = double(image((x2(condition21)-1)*height + y1(condition21)));
                value22(condition22) = double(image((x2(condition22)-1)*height + y2(condition22)));
                value11(~condition11) = 0;
                value12(~condition12) = 0;
                value21(~condition21) = 0;
                value22(~condition22) = 0;
                result = uint8(value22.'.*(oldX-x1).*(oldY-y1) + value11.'.*(x2-oldX).*(y2-oldY) + ...
                               value21.'.*(oldX-x1).*(y2-oldY) + value12.'.*(x2-oldX).*(oldY-y1));
                result = reshape(result, newHeight, newWidth);
            otherwise
                disp('Sorry, unsupported algorithm. Only nearest/bilinear is supported.');
        end
    end

    imageInfo = size(image);
    imageType = size(imageInfo);
    if (imageType(2) == 2) % grey-scale image
        result = my_imRotateSingleChannel(image, angel, method);
    elseif (imageType(2) == 3 && imageInfo(3) == 3) % RGB image: rotate each channel
        temp = my_imRotateSingleChannel(image(:,:,1), angel, method);
        [newHeight, newWidth] = size(temp);
        result = temp(:);
        temp = my_imRotateSingleChannel(image(:,:,2), angel, method);
        result = [result temp(:)];
        temp = my_imRotateSingleChannel(image(:,:,3), angel, method);
        result = [result temp(:)];
        result = reshape(result, newHeight, newWidth, 3);
    else
        disp('Sorry, unsupported input matrix. Only grey/rgb image is supported.');
    end
end
The bulk of the work in imrotate is done by the internal imrotatemex command, which is implemented in compiled C code.
Your code creates numerous temporary arrays that are as big as the original image, which is typical of vectorised MATLAB code. Memory allocation takes a little bit of time, but nothing close to 1 second. I'm guessing the main issue is that each new array is not in the cache, so you are constantly reading and writing "cold" main memory. In comparison, the compiled C code uses a for loop, probably similar to your original implementation, keeping all the necessary data in CPU registers or on the stack and performing much better with the cache.
But @Shai is right: you should profile and find out which parts are slow. One second sounds too slow for rotating an HD image.
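For the bilinear case, much of the hand-rolled gather logic can also be handed to interp2, which runs in optimised native code and may close some of the gap. A sketch (not the poster's or Mathworks' implementation; img is one grey-scale uint8 channel and angleDeg the rotation in degrees, following the same backward-mapping convention as the code above):

```matlab
% hedged sketch: rotate one grey-scale channel via interp2 (bilinear)
theta = -angleDeg/180*pi;
R = [cos(theta), -sin(theta); sin(theta), cos(theta)];
[h, w] = size(img);
corners = [w, h; w, -h] * R;                   % transformed size vectors
newW = ceil(max(corners(:,1)));
newH = ceil(max(corners(:,2)));
[X, Y] = meshgrid((1:newW) - newW/2, (1:newH) - newH/2);
src = [X(:), Y(:)] * R;                        % map output pixels back to input
Xs = reshape(src(:,1) + w/2, newH, newW);
Ys = reshape(src(:,2) + h/2, newH, newW);
rotated = uint8(interp2(double(img), Xs, Ys, 'linear', 0)); % 0 outside image
```

interp2's final argument fills out-of-bounds samples with 0, which replaces the four condition masks and the manual boundary zeroing in one step.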
I have a MATLAB variable that is a 3x6 cell array. One of the columns of the cell array holds at most 150-200 small RGB images, around 16x20 pixels each (again, at most). The rest of the columns are:
an equal number of labels, which are strings of 3 or 4 characters,
an image mask, which is about 350x200,
3 integers.
For some reason saving this object is taking a very long time, at least for the size of the object. It has already been 10 minutes (which isn't too bad, but I plan on expanding the object to hold several thousand of those small images) and MATLAB doesn't seem to be making any progress. In fact, when I open the containing directory of the variable, its size is cycling between 0 bytes and about 120 kB (i.e. it increases to 120 in steps of 30 or 40 kB, then restarts).
Is this normal behaviour? Do MATLAB variables always take this long to save? What's going on here?
Mistake: I'm saving AllData, not my SVM variable. AllData has the same data as the SVM keeper, minus the actual SVM itself and one integer.
Which particular parts of the code would be helpful to show for solving this? The code itself is a few hundred lines, broken up into several functions. What would be important to consider to troubleshoot this: where the variable is created? where it's saved? the way I create the smaller images?
Hate to be the noob who takes a picture of their desktop, but the machine I'm working on has problems taking screenshots. Anyway, here it is.
AllData/curData are just subsets of the 3x7 array... actually it's a 3x8, but the last column is just an int.
Interesting side point: I interrupted the saving process, and the variable seemed to save just fine. I trained a new SVM on the saved data and it works. I'd like not to have to do that in the future, though.
Using whos:

Name                           Size          Bytes  Class    Attributes
AllData                        3x6          473300  cell
Image                          240x352x3    253440  uint8
RESTOREDEFAULTPATH_EXECUTED    1x1               1  logical
SVMKeeper                      3x8         2355638  cell
ans                            3x6          892410  cell
curData                        3x6          473300  cell
dirpath                        1x67            134  char
im                             240x352x3   1013760  single
s                              1x1          892586  struct
Updates:
1. Does this always happen, or did you only do it once?
- It always happens.
2. Does it take the same time when you save it to a different (local) drive?
- I will investigate this more when I get back to that computer.
3. How long does it take to save a 500 kB matrix to that folder?
- Almost instantaneous.
4. And, as asked above, what is the function call that you use?
- Code added below.
(Image is an RGB image)

MaskedImage(:,:,1) = Image(:,:,1).*Mask;
MaskedImage(:,:,2) = Image(:,:,2).*Mask;
MaskedImage(:,:,3) = Image(:,:,3).*Mask;
MaskedImage = im2single(MaskedImage);
....
(I use some method to create a bounding box around my 16x20 image)
(this is in a loop that occurs about 100-200 times)

Misfire = input('is this a misfire?', 's');
if (strcmpi(Misfire, 'yes'))
    curImageReal = MaskedImage(j:j+Ybound, i:i+Xbound, :);
    Training{curTrainingIndex} = curImageReal; % Training is a cell array of images
    Labels{curTrainingIndex} = 'ncr';
    curTrainingIndex = curTrainingIndex + 1;
end

(the loop ends)...
SaveAndUpdate = input('Would you like to save this data?(say yes,definitely)', 's');
undecided = 1;
while (undecided)
    if (strcmpi(SaveAndUpdate, 'yes,definitely'))
        AllData{curSVM,4} = Training;
        AllData{curSVM,5} = Labels;
        save(strcat(dirpath, '/', TrainingName), 'AllData'); % <--- STUCK HERE
        undecided = 0;
    else
        DontSave = input('Im not going to save. Say YESNOSAVE to NOT SAVE', 's');
        if (strcmpi(DontSave, 'yesnosave'))
            undecided = 0;
        else
            SaveAndUpdate = input('So... save? (say yes,definitely)', 's');
        end
    end
end
It is a bit unclear whether you are doing some custom file saving or not. If it is the former, I'm guessing that you have a really slow save loop going on, maybe some hardware issue. Try saving the data using MATLAB's save function:
tic
save('test.mat', 'AllData')
toc
If that works fine, try to work your way from there, e.g. saving one element at a time.
You can also profile your code using the profiler: open it with the command profile viewer, then type the code, script, or function you want to profile into the input text field.
This isn't a great answer, but it seems the problem was that I was saving the version of my image after I had converted it to single. I don't know why this caused such a dramatic slowdown (after removing that line of code it saved instantly), so if someone could edit my answer to shed more light on the situation, that would be appreciated.
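One plausible contributing factor (an assumption, not verified against the poster's machine) is sheer data volume: im2single stores 4 bytes per element versus 1 for uint8, and floating-point pixel data also tends to compress much less well in the default compressed MAT-file format:

```matlab
% hedged sketch: compare in-memory size of the same image as uint8 vs single
img_u8 = uint8(255*rand(240, 352, 3)); % stand-in for the Image variable above
img_sg = im2single(img_u8);            % same values scaled to [0,1], 4 bytes/element
whos img_u8 img_sg                     % the single copy takes 4x the bytes
```

So keeping the images as uint8 until single precision is actually needed (e.g. just before training) keeps both the file size and the save time down.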