Transparency Error when using table in parfor loop - matlab

I am trying to use a table within a parfor loop in MATLAB. This gives me "Transparency violation error. See Parallel Computing Toolbox about transparency". I'm trying to build this table so I can make a prediction using a trained classifier from the MATLAB Classification Learner app (trainedClassifier.predictFcn(T)), so either I need to build a table within the parfor loop or I need some alternative to a table that I can still feed into the classifier.
parfor i=1:100
acheck=1;
bcheck=2;
ccheck=3;
T=table(acheck,bcheck,ccheck);
end

This solution works for your particular problem:
parfor i=1:100
acheck=1;
bcheck=2;
ccheck=3;
T(i,:)=table([acheck,bcheck,ccheck]);
end
Note that your original program overwrites the existing values on every iteration, so you would end up with a one-row table. I assumed that was not intended. (That is in fact what a plain for loop would produce.)
Also, since this is a parfor and T is created inside the loop (as are acheck, etc.), assigning to plain T leaves nothing behind after the loop. The variable is a temporary one: it is local to each worker and is not available in the client workspace once the loop finishes (see the section on temporary variables in the parfor documentation).
To fix both the overwriting and the accessibility problem, the program assigns each set of variables to its own row of T. If the square brackets are omitted, the program throws a transparency error. Unfortunately, I do not know why that is, but it may be caused by operations the table data structure performs internally. Maybe someone else will know the answer; for now, this seems to solve your problem.
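One alternative, not taken from the answer above, is to keep the table out of the loop entirely: collect each iteration's values in a sliced numeric matrix and convert it once afterwards. The sketch below assumes all three values are numeric scalars; the variable name rows is made up for illustration.

```matlab
n = 100;
rows = zeros(n, 3);              % sliced output: one row per iteration

parfor i = 1:n
    acheck = 1;
    bcheck = 2;
    ccheck = 3;
    rows(i, :) = [acheck, bcheck, ccheck];   % indexed by the loop variable
end

% Build the table once, outside the parfor, where transparency is not an issue.
T = array2table(rows, 'VariableNames', {'acheck', 'bcheck', 'ccheck'});
```

This only helps if the prediction can also be made after the loop, on the whole table at once. If the classifier has to be called inside the loop, the table still has to be built there.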

Related

Broadcast variables in parfor loop in MATLAB

The following loop results in a warning for C_mat and B_mat:
%previously defined
N_RIPETIZIONI=2;
K=201;
parfor n=1:N_RIPETIZIONI*K
[r,k]=ind2sub([N_RIPETIZIONI,K],n);
B=B_mat{r};
C=C_mat{r};
end
The warning says:
The entire array or structure B_mat is a broadcast variable. This might result in unnecessary communication overhead.
The same for C_mat.
How can I fix it so that B_mat and C_mat are no longer broadcast variables?
The issue is that, because of the way you index B_mat (i.e. not using n), every worker in the parfor requires the entirety of B_mat to run. The big bottleneck in parfor code is transferring copies of the data to each worker.
MATLAB is basically telling you that if you were to do this, your code may actually be slower than it would be otherwise. It's not that B_mat is some special type of variable called "broadcast"; it's that, the way you wrote the code, each n in the parfor requires a copy of B_mat.
I assume this is not your real code, so we can't really help you fix it, but hopefully this explains it.
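As a rough illustration of the point above, one way to make B_mat and C_mat sliced rather than broadcast is to let the parfor run over r and move k into an inner for loop. This is only a sketch under assumptions: the cell contents and the per-(r,k) computation are placeholders, since the real loop body is not shown in the question.

```matlab
N_RIPETIZIONI = 2;
K = 201;
B_mat = {ones(2), 2 * ones(2)};      % placeholder data, one cell per repetition
C_mat = {eye(2), zeros(2)};          % placeholder data
out = zeros(N_RIPETIZIONI, K);

parfor r = 1:N_RIPETIZIONI
    B = B_mat{r};                    % indexed by the loop variable: sliced
    C = C_mat{r};
    row = zeros(1, K);               % temporary, local to this iteration
    for k = 1:K
        row(k) = k * (sum(B(:)) + sum(C(:)));   % stand-in for the real work
    end
    out(r, :) = row;                 % sliced output
end
```

The trade-off is that only N_RIPETIZIONI iterations can run in parallel, so with few repetitions and cheap data transfer the original broadcast version may still be the faster one.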

Parfor works with test data but not real data

I am using MATLAB R2016a. I have four matrices of size 2044x1572x84 and am trying to regress each column of each matrix to produce a new 2044x1572 matrix of regression coefficients. I need to use parfor; a for loop would take way too long.
When I run the code below on test data (e.g. using rand to make four matrices of 50x50x40), it executes with no errors. However, when I run the same code on a cluster with the full 2044x1572x84 matrices, I get a transparency violation error relating to the table: Error using table (line 247) Transparency violation error. I've tried modifying the table code to fix this but only get a suite of other errors.
I'm unsure how to fix the error in this case, particularly given that the success of the code seems to be dependent on the size of the input data. I'm not particularly familiar with parfor, and any feedback on what I may be doing wrong would be greatly appreciated.
COEFF_LST=ones(2044,1572);
parfor i=1:2044
    for j=1:1572
        ZZ=squeeze(ARRAY_DETREND_L2_LST(i,j,:));
        XX=squeeze(ARRAY_DETREND_L2_ONDVI(i,j,:));
        YY=squeeze(ARRAY_DETREND_WB_85(i,j,:));
        LL=squeeze(ARRAY_DETREND_L2_CNDVI(i,j,:));
        T=table(ZZ,XX,YY,LL,'VariableNames',{'LST','ONDVI','DROUGHT','NDVI'});
        lm=fitlm(T);
        array=table2array(lm.Coefficients);
        COEFF_LST(i,j)=array(3,1);
    end
end
The table constructor uses inputname under certain circumstances, and that can cause transparency violations inside parfor. I realise it's inconvenient, but you could try "hiding" the table call inside a separate function, for example:
parfor ...
T = myTableBuilder(ZZ,XX,...);
end
function t = myTableBuilder(varargin)
t = table(varargin{:});
end
In this case I'm getting a transparency error with table, so a simple solution that works is to not use table.
In this case the code would be:
Predictor_Matrix=horzcat(ZZ,XX,YY);
lm = fitlm(Predictor_Matrix,LL); % LL is the response: fitlm(T) used the last table variable (NDVI)
This works on a cluster without throwing any errors.
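For context, here is a minimal sketch of how the table-free call might sit inside the full loop. It assumes small random stand-ins for the four arrays (the names A_LST, A_ONDVI, A_WB and A_NDVI are made up), treats the fourth array as the response as fitlm(T) did in the question, and needs the Statistics and Machine Learning Toolbox for fitlm.

```matlab
nr = 4; nc = 3; nt = 20;                 % tiny sizes instead of 2044x1572x84
A_LST   = rand(nr, nc, nt);              % hypothetical stand-in arrays
A_ONDVI = rand(nr, nc, nt);
A_WB    = rand(nr, nc, nt);
A_NDVI  = rand(nr, nc, nt);
COEFF = zeros(nr, nc);

parfor i = 1:nr
    rowCoeff = zeros(1, nc);             % temporary, filled by the inner loop
    for j = 1:nc
        X = [squeeze(A_LST(i, j, :)), ...
             squeeze(A_ONDVI(i, j, :)), ...
             squeeze(A_WB(i, j, :))];    % nt-by-3 predictor matrix
        y = squeeze(A_NDVI(i, j, :));    % nt-by-1 response
        lm = fitlm(X, y);
        % Row 1 is the intercept, so row 3 is the second predictor,
        % matching array(3,1) in the question.
        rowCoeff(j) = lm.Coefficients.Estimate(3);
    end
    COEFF(i, :) = rowCoeff;              % sliced output
end
```

Writing a whole row per parfor iteration keeps the output indexing simple for the parfor analyzer; whether it makes a measurable difference at full size is something to check on the real data.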

How to load .mat files in the folder for parfor in MATLAB

I want to run a parfor loop in MATLAB with following code
B=load('dataB.mat'); % B is a 1600*100 matrix stored as 'dataB.mat' in the local folder
simN=100;
cof=cell(1,simN);
se=cell(1,simN);
parfor s=1:simN
[estimates, SE]=fct(0.5,[0.1,0.8,10]',B(:,s));
cof{s}=estimates';
se{s}=SE';
end
However, the code does not seem to work: there are no warnings, it just runs forever without any output. When I terminated the loop, I found it had never entered the function 'fct'. Any help would be appreciated on how to load external data like 'dataB.mat' for parallel computing in MATLAB.
If I type this on my console:
rand(1600,100)
and then I save my current workspace as dataB.mat, this command:
B = load('dataB.mat');
returns a 1-by-1 struct whose ans field holds the 1600x100 double matrix. In each iteration of your loop you extract a column of B before calling fct (the extracted column becomes the third argument of the call, so it must be valid before it is passed). If B is a struct, B(:,s) does not select a column of the matrix; you would have to go through the field first. Did you check what B actually contains, for example with a breakpoint, before running the parfor loop?
Also, keep in mind that the first time you execute a parfor loop in a fresh MATLAB session, MATLAB must start all the workers, and this may take a very long time. Be patient and, if necessary, run a second test to see whether the problem persists once you are certain the workers are running.
If neither of these is the cause of your issue, I suggest you run a standard loop (for instead of parfor) and set a breakpoint on the first line of the loop body. This should help you spot the problem very quickly.
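To make the struct point concrete, here is a small sketch. It assumes the matrix was saved under the variable name B (the question does not say which name was used), writes a demo file to tempdir rather than touching the real dataB.mat, and replaces fct with a column mean because fct is not shown.

```matlab
% Create a demo file; the file name and the variable name B are assumptions.
dataFile = fullfile(tempdir, 'dataB_demo.mat');
B = rand(1600, 100);
save(dataFile, 'B');
clear B

S = load(dataFile);          % S is a struct with one field per saved variable
disp(fieldnames(S));         % check which variables the file really contains
B = S.B;                     % pull the matrix out of the struct before the loop

simN = size(B, 2);
colMean = zeros(1, simN);
parfor s = 1:simN
    colMean(s) = mean(B(:, s));   % stand-in for the call to fct
end

delete(dataFile);            % remove the demo file
```

If the file was saved from a workspace where the matrix had another name, the field name changes accordingly, which is why listing fieldnames(S) first is a cheap check.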

Why does Matlab's clear violate transparency?

While using MATLAB's parfor I came across the following behaviour:
parpool(2)
parfor j=1:100
v = j+1;
clear v
end
> Error in ==> parallel_function>make_general_channel/channel_general at 886
> Transparency violation error.
I looked into it, and indeed one is not allowed to use clear within parfor.
My question is why. v is created inside every specific worker, and so it does not interfere with other workers.
MATLAB uses a static code analyzer to understand how the body of a parfor loop interacts with the main workspace, i.e. which variables need to be transferred to the workers and back. A number of functions, such as eval, evalc, evalin, assignin (with the workspace argument specified as 'caller'), load (unless the output is assigned to a variable), save and clear, can modify the workspace in ways that the static analyzer cannot predict. When such functions are used, there is no way to ensure the integrity of the workspace while multiple workers are operating on it.
The important thing to realize is that when you use command syntax to invoke a function, such as clear v, the argument is passed as a string literal. The static analyzer therefore cannot tell which variable you are trying to clear, and so cannot work out what effect the command will have on the workspace.
As suggested in the documentation, the workaround to free up most of the memory used by a variable inside parfor is: v = [];
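A minimal sketch of that workaround in a loop, with a made-up temporary and a simple reduction so that the loop produces a checkable result:

```matlab
total = 0;
parfor j = 1:100
    v = rand(1000, 1) + j;        % large temporary, local to this iteration
    total = total + numel(v);     % reduction variable
    v = [];                       % releases the data; used instead of "clear v"
end
```

The variable v still exists after the assignment, but it holds an empty array, so the memory behind the original data can be reclaimed.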

MATLAB parfor - cannot determine whether "ModelUtil" refers to a function or variable?

I am calling external functions in my parfor loop as follows.
parfor idx = 1:2
import com.comsol.model.*
import com.comsol.model.util.*
model = ModelUtil.create('Model');
model.modelNode.create('comp1');
model.geom.create('geom1', 2);
model.geom('geom1').feature.create('sq1', 'Square');
model.geom('geom1').feature('sq1').set('size', '0.03125');
model.geom('geom1').feature('sq1').setIndex('pos', '0', 0);
model.geom('geom1').feature('sq1').setIndex('pos', '0', 1);
model.geom('geom1').run;
end
Error: MATLAB cannot determine whether "ModelUtil" refers to a function or variable.
See Parallel for Loops in MATLAB, "Unambiguous Variable Names".
After reading the "Unambiguous Variable Names" part in the MATLAB parfor documentation, I pretty much understand why this error occurs. However, I have no idea how to fix it.
I encountered the same problem with .NET objects. Like you, I found it works perfectly in non-parallel mode and the problem only occurs with parfor.
This is not a question of paths; it is purely a MATLAB syntax/parser problem.
As a workaround, I wrapped the operations in a separate Matlab function so that within the scope of the parfor it is no longer ambiguous. Within the (otherwise unnecessary) function it has no trouble resolving the class, even when called from within parfor.
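A sketch of that wrapper pattern is below. It uses java.util.ArrayList as a stand-in because the COMSOL classes are not available here, and the function name buildList is made up. It assumes a MATLAB release that supports local functions in scripts (R2016b or later); otherwise the function goes in its own file.

```matlab
sizes = zeros(1, 2);
parfor idx = 1:2
    % Only a plain function call appears in the parfor body,
    % so nothing in it is ambiguous to the analyzer.
    sizes(idx) = buildList(idx);
end

function n = buildList(k)
% The import and the class reference live inside the function,
% where MATLAB resolves ArrayList as a class without trouble.
import java.util.*
list = ArrayList();
for i = 1:k
    list.add(i);
end
n = list.size();
end
```

For the code in the question, the body of the parfor would move into such a function in the same way, with the two import lines at the top of it.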
Vector's solution didn't work for me. The problem is that the workers don't see the library you are using: you have to update the Java path inside the parfor with javaaddpath.