Using Parsave inside parfor loop - saves only my last variable result - matlab

This is a simplified version of my code. What I want is variable1 stored as a row vector in result.mat. The problem is that parsave saves only one result, the last one, from the 10th iteration. How can I save all the results in one vector (variable1) inside the parfor loop?
parfor ii = 1:10
[variable1, variable2] = MyFunction(~,~,ii);
parsave('result.mat',variable1, variable2)
end
function parsave(filename, varargin)
narginchk(2, Inf);
nargoutchk(0, 0);
for I = 2:nargin
varname = genvarname(inputname(I));
eval([varname ' = varargin{' num2str(I-1) '};'])
if (I == 2)
save(filename, varname)
else
save(filename, varname, '-append')
end
end

Inside parsave, you have the following statement:
if (I == 2)
save(filename, varname)
else
save(filename, varname, '-append')
end
Every call to parsave starts its loop at I == 2, so the first save in each call is made without '-append' and overwrites filename. Only the variables from the most recent call survive.
In addition, your code is at high risk of a race condition: if two workers try to do the "initial save" of the file at the same time, one will overwrite the other or throw a "cannot access file" error.
You're much better off saving to temporary files inside the parfor loop and following this up with an aggregator function that combines the separate saves into something useful, or simply saving into a new directory every time you run the loop, rather than creating a new file.
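A minimal sketch of the temp-file approach, not a drop-in replacement for the original code. The scratch directory, the file-name pattern, the helper saveIteration, and the stand-in computation ii^2 (in place of MyFunction) are all assumptions made for illustration. Each iteration writes its own file, so no two workers touch the same file, and a plain for loop afterwards gathers the pieces into one row vector.

```matlab
% Sketch: one temp file per iteration, then aggregate.
% Local functions in a script need R2016b or later; otherwise put
% saveIteration in its own file.
n = 10;
outDir = tempname;              % unique scratch directory (assumption)
mkdir(outDir);

parfor ii = 1:n
    variable1 = ii^2;           % hypothetical stand-in for MyFunction
    saveIteration(fullfile(outDir, sprintf('part_%03d.mat', ii)), variable1);
end

% Aggregator: read the per-iteration files back in order.
allVariable1 = zeros(1, n);
for ii = 1:n
    s = load(fullfile(outDir, sprintf('part_%03d.mat', ii)), 'variable1');
    allVariable1(ii) = s.variable1;
end
save(fullfile(outDir, 'result.mat'), 'allVariable1');

function saveIteration(fname, variable1)
    % save cannot be called directly in a parfor body, hence this helper.
    save(fname, 'variable1');
end
```

If the results are only needed once the loop has finished, assigning to a sliced output such as variable1(ii) inside the parfor loop and saving once afterwards avoids the intermediate files altogether.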

Got an elegant solution on http://www.mathworks.com/matlabcentral/answers/179884-save-inside-parfor-loop-at-a-specific-iteration-step
parpool('local',10); % results will be distributed on 10 workers
output1=distributed.NaN(1,1e5); %pre-allocate
output2=distributed.NaN(1,1e5);
spmd
for ii=drange(1:1e5)
[output1(ii), output2(ii)] = MyFunction(~,~,ii);
parsave(['output1', num2str(labindex)],...
getLocalPart(output1));
parsave(['output2', num2str(labindex)],...
getLocalPart(output2));
end
end
I call parsave twice so that, if the machine shuts down before the simulation ends, the MAT-files (one per worker for each of output1 and output2) will already hold the intermediate results, which I'll need to concatenate later.
The parsave function is:
function parsave(fname,data)
var_name=genvarname(inputname(2));
eval([var_name '=data;'])
try
save(fname,var_name,'-append')
catch
save(fname,var_name)
end
% Written by Minjie Xu (chokkyvista06@gmail.com)

Related

Fast way to check if variable is in .mat file without loading .mat file? 'who'/'whos' is not faster than loading.. Better options than 'who'?

I have a .mat file named "myfile.mat" that contains a huge variable data and, in some cases, another variable data_info. What is the fastest way to check whether that .mat file contains the data_info variable?
The who and whos commands are no faster than simply loading the file and testing for the existence of the variable.
nRuns=10;
%simply loading the complete file
tic
for p=1:nRuns
load('myfile.mat');
% do something with variable
if exist('data_info','var')
%do something
end
end
toc
% check with who
tic
for p=1:nRuns
variables=who('-file','myfile.mat');
if ismember('data_info', variables)
% do something
end
end
toc
% check with whos
tic
for p=1:nRuns
info=whos('-file','myfile.mat');
if ismember('data_info', {info.name})
%do something
end
end
toc
All three methods take roughly the same time (which is way too slow, since data is huge).
However, this is very fast:
tic
for p=1:nRuns
load('myfile.mat','data_info');
if exist('data_info', 'var')
%do something
end
end
toc
But it issues a warning if data_info does not exist. I could suppress the warning, but that doesn't seem like the best way to do this. What other options are there?
Edit
using who('-file', 'myfile.mat', 'data_info') is also not faster:
tic
for p=1:nRuns
if ~isempty(who('-file', 'myfile.mat', 'data_info'))
% do something
end
end
toc % this takes 7 seconds, roughly the same as simply loading the complete .mat file
Try using who restricting it to only the specific variable:
...
if ~isempty(who('-file', 'myfile.mat', 'data_info'))
%do something
end
Timing the solutions:
Using timeit on the different solutions (code included below, running on Windows 7 and MATLAB version R2016b) shows that the who-based ones appear fastest, with the one I suggested above having a slight edge in speed. Here's the timing, from slowest to fastest:
Load whole file: 0.368235871921381 sec
Using matfile: 0.001973860748417 sec
Load only `data_info`: 0.000316989486384 sec
Using whos + ismember: 0.000174207817967 sec
Using who + ismember: 0.000151289605527 sec
Using who + isempty: 0.000137261391331 sec
I used a sample MAT file containing the following variables:
data = ones(10000);
data_info = 'hello';
Here's the test code:
function T = infotest
T = zeros(6, 1);
T(1) = timeit(@use_load_exist_1);
T(2) = timeit(@use_load_exist_2);
T(3) = timeit(@use_matfile);
T(4) = timeit(@use_whos_ismember);
T(5) = timeit(@use_who_ismember);
T(6) = timeit(@use_who_isempty);
end
function isThere = use_load_exist_1
load('infotest.mat');
isThere = exist('data_info', 'var');
end
function isThere = use_load_exist_2
load('infotest.mat', 'data_info');
isThere = exist('data_info', 'var');
end
function isThere = use_matfile
isThere = isprop(matfile('infotest.mat'), 'data_info');
end
function isThere = use_whos_ismember
info = whos('-file', 'infotest.mat');
isThere = ismember('data_info', {info.name});
end
function isThere = use_who_ismember
variables = who('-file', 'infotest.mat');
isThere = ismember('data_info', variables);
end
function isThere = use_who_isempty
isThere = ~isempty(who('-file', 'infotest.mat', 'data_info'));
end
You can use the who command https://www.mathworks.com/help/matlab/ref/who.html
The syntax for this is to call who with the indicator of the file and then the variable you are looking for. You do not need to look for all the variables in the file
Dummy syntax is as follows
variable = who('-file','yourfilenamehere','data_info')
From there you can call
if ~isempty(variable)
%do something
end
This searches for only that variable within the file. In your versions of the who command you looked for all variables whereas this just looks for one.
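As a rough sketch, the single-variable check can be wrapped in a small helper. The function name hasVariable and the demo file written to tempdir are assumptions for illustration; only who with the '-file' option comes from the answer itself.

```matlab
% Demo: create a small MAT file, then test for variables by name.
% Local functions in a script need R2016b or later.
demoFile = fullfile(tempdir, 'hasvariable_demo.mat');
data = ones(100);
data_info = 'hello';
save(demoFile, 'data', 'data_info');

isPresent = hasVariable(demoFile, 'data_info');   % variable is in the file
isAbsent  = ~hasVariable(demoFile, 'not_there');  % variable is not in the file

function tf = hasVariable(matFile, varName)
    % who('-file', ...) returns only the names that match varName.
    tf = ~isempty(who('-file', matFile, varName));
end
```

Because who accepts wildcards, a varName containing * would match any variable fitting the pattern rather than one exact name.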
So it's a bit messy, but I just tried this and it's pretty much instant regardless of file size. Let me know if it works for you.
Please excuse the formatting; I'm not used to formatting posts properly here.
Note: This solution uses the low-level HDF5 libraries that are already built into MATLAB, so it assumes your MAT-file is in HDF5 format (-v7.3). Otherwise it will not work.
You can make sure it is a valid HDF5 file by doing this:
isValidHDF = H5F.is_hdf5('myfile.mat');
To see if your variable exists:
isThere = false; %Initialize as default value of false
fid = H5F.open('myfile.mat'); % Use the low-level H5F built-in to open the file
try % try/catch is usually best avoided, but this is a case where it is fine
% Try to open the HDF5 group. If the variable isn't there, this errors and isThere stays false; otherwise the variable exists
gid = H5G.open(fid, '/data_info'); % Note: the "/" is required and OS-independent, so it is never "\", even on Windows
% I think this makes sure the variable isn't empty if the group opened successfully, but it hasn't been a problem yet
hInfo = H5G.get_info(gid);
isThere = hInfo.nlinks > 0;
H5G.close(gid);
end
H5F.close(fid);
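A possible variation, offered as a sketch rather than a tested replacement: H5G.open only succeeds on HDF5 groups, and as far as I know plain numeric and char variables in a -v7.3 file are stored as datasets, so asking whether a link with that name exists may be more general. The demo file name is an assumption; H5L.exists and onCleanup are standard MATLAB functions.

```matlab
% Demo: write a -v7.3 (HDF5) MAT file, then test for links by name.
demoFile = fullfile(tempdir, 'h5exists_demo.mat');   % assumed location
data_info = 'hello';
save(demoFile, 'data_info', '-v7.3');

fid = H5F.open(demoFile, 'H5F_ACC_RDONLY', 'H5P_DEFAULT');
closeFile = onCleanup(@() H5F.close(fid));   % close the file even on error

% H5L.exists returns a positive value when the named link exists.
isThere    = H5L.exists(fid, 'data_info', 'H5P_DEFAULT') > 0;
isNotThere = H5L.exists(fid, 'not_there', 'H5P_DEFAULT') > 0;

clear closeFile   % triggers H5F.close(fid)
```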

Continue ‘for’ loop with the existing variables in Matlab

I have a MATLAB script including a for loop which looks like the following:
for k = 1:10
c = myfun(k,a,b);
result{k} = c;
end
The problem is that, during the for loop, myfun() sometimes hits an error and stops. After fixing the error in myfun(), how can I continue running with the existing values of the variables? The reason is that myfun() takes a very long time to produce a result, and the previous results are correct.
For example, if an error happens when k == 4, I save all the variables in the current workspace. I then set a breakpoint at c = myfun(k,a,b); and restore the saved variables, but I find that in the next iteration k will be 2 instead of 5 as I want. I think MATLAB does not allow the value of k to be modified during the for loop. I have tested this a few times.
How can I continue the for loop with some existing data?
You cannot change your for loop iterator programmatically inside of the loop.
For example:
for ii = 1:3
disp(ii)
ii = 3;
end
Prints:
1
2
3
If you're going to be modifying code based on the errors you receive, dbstop if error is not going to help, because it will not reflect changes in your code until the debugger is exited and your code is executed again (unless you execute the code manually in the debugger). If you're not modifying code, you could potentially use a try/catch block to catch fixable issues.
If you're loading data for a later index and then restarting, you can change where your for loop begins, or use a while loop (if appropriate).
For example:
% Load data here
for ii = 3:3
disp(ii)
end
Prints 3.
The equivalent while loop would be:
% Load data here
ii = 3;
while ii <= 3
disp(ii)
ii = ii + 1;
end
This gives the same result.
One solution is to catch the exception first, as in the following, and move on past the failing iterations:
bug = [];
for k = 1:10
try
c = myfun(k,a,b);
result{k} = c;
catch
warning('some bug for the following values:');
display([k a b]);
bug = [bug; k a b];
result{k} = NaN;
end
end
Then iterate over bug to compute missing information after debugging. This solution works when your algorithm is not dependent on the previous value of the result (or is not recursive).
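Another way to make the loop resumable is to checkpoint after every iteration and start from the last completed index on the next run. This is only a sketch: the checkpoint file name, the variable lastDone, and the stand-in computation k^2 (in place of myfun(k,a,b)) are assumptions.

```matlab
ckptFile = fullfile(tempdir, 'loop_checkpoint_demo.mat');  % assumed location
nIter = 10;

if exist(ckptFile, 'file')
    % Resume: restore earlier results and continue after the last good index.
    s = load(ckptFile, 'result', 'lastDone');
    result = s.result;
    startK = s.lastDone + 1;
else
    % Fresh start.
    result = cell(1, nIter);
    startK = 1;
end

for k = startK:nIter
    result{k} = k^2;          % hypothetical stand-in for myfun(k,a,b)
    lastDone = k;
    save(ckptFile, 'result', 'lastDone');   % checkpoint after each iteration
end
```

If the computation fails at some k, fixing the function and rerunning the script picks up at that same k, because lastDone still points at the previous iteration. Delete the checkpoint file before starting a genuinely fresh run.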

Running Matlab script many times

So I have a MATLAB script file (.m). When it runs, it generates a vector. I want to save that vector and rerun the script all over again. How do I put a loop around the entire script and create a vector_{i}, where the loop index becomes part of the name? I would post the code, but it won't work without the data on my desktop.
[data,labels]=xlsread('C:\Users\Hilbert\Desktop\matlab\matlabdata_abe.xlsx');
gdp=log(data(:,1)./lagmatrix(data(:,1),1)) %GDP
ip=log(data(:,2)./lagmatrix(data(:,2),1)) %IP
tnx=data(:,3) %TNX
m2=log(data(:,4)./lagmatrix(data(:,4),1)) %M2
cpi=log(data(:,5)./lagmatrix(data(:,5),1)) %CPI
ffed=log(data(:,6)./lagmatrix(data(:,6),1)) %FedFund
Dgdp=gdp
inflation=cpi
Dm2=m2
ffr_=ffed
data=[Dgdp(54:length(cpi)), inflation(54:length(cpi)), Dm2(54:length(cpi)), ffr_(54:length(cpi)) ];
data_L1=lagmatrix(data,1)
data_L2=lagmatrix(data,2)
data_L3=lagmatrix(data,3)
data_L4=lagmatrix(data,4)
mat=[ones(1,size(data_L1',2));data_L1';data_L2';data_L3';data_L4']
mat=mat(:,5:end)
X=[data';data_L1';data_L2';data_L3']
X=X(:,5:end)
mat=mat';
X=X'
Fhat=(inv(mat'*mat) * mat'*X)';
nobs=size(data,1)
p=4
yhat= mat*Fhat'
yhat=yhat(:,1:4)
data_sample=data(5:nobs,:)
res=data_sample - yhat
res_{loopindexnumber}=res % pseudocode: save the vector and re-run the entire code again; the idea is to bootstrap the data by running many simulations and saving the residual vector
Make the script a function, and then execute the function in a loop as many times as you want. For example:
function res = my_function(k)
% your script goes here.
% the function is saved in the file my_function.m
% some calculations producing return_vector using the parameter k
res = return_vector;
Later on, just run a for loop over the function and store the results to a cell array:
for k = 1:10
A{k} = my_function(k)
end
Alternatively, make the script a function as above and use assignin in the loop to create separately named variables (A_1, A_2, ...) in the base workspace:
for k = 1:10
assignin('base', ['A_', num2str(k)], my_function(k))
end
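A cell array indexed by the loop counter is generally easier to work with afterwards than numbered variables. The sketch below uses an anonymous function runOnce as a hypothetical stand-in for the script-turned-function, and assumes every run returns a column vector of the same length so the results can be placed side by side.

```matlab
nRuns = 10;
runOnce = @(k) k * (1:3)';    % hypothetical stand-in for my_function(k)

res = cell(1, nRuns);         % one cell per run
for k = 1:nRuns
    res{k} = runOnce(k);
end

% Same-sized column vectors can be concatenated into one matrix,
% with one column per run.
allRes = [res{:}];
```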

Looping a Function in Matlab

Total newbie here. I'm having problems looping a function that I've created. I'm having some problems copying the code over, but I'll give a general idea of it:
function[X]=Test(A,B,C,D)
other parts of the code
.
.
.
X = linsolve(K,L)
end
where K,L are other matrices I derived from the 4 variables A,B,C,D
The problem is whenever I execute the function Test(1,2,3,4), I can only get one answer out. I'm trying to loop this process for one variable, keep the other 3 variables constant.
For example, I want to get answers for A = 1:10, while B = 2, C = 3, D = 4
I've tried the following method and they did not work:
function[X] = Test(A,B,C,D)
for A = 1:10
other parts of the code...
X=linsolve(K,L)
end
Whenever I keyed in the command Test(1,2,3,4), it only gave me the output of Test(10,2,3,4)
Then I read somewhere that you have to call the function from somewhere else, so I edited the Test function to be function[X] = Test(B,C,D) and left A out so that it could be assigned in another script, e.g.:
global A
for A = 1:10
Test(A,2,3,4)
end
But this gives an error as well, as Test function requires A to be defined. As such I'm a little lost and can't seem to find any information on how can this be done. Would appreciate all the help I can get.
Cheers guys
I think this is what you're looking for:
A=1:10; B=2; C=3; D=4;
%Do pre-allocation for X according to the dimensions of your output
for iter = 1:length(A)
X(:,:,iter)= Test(A(iter),B,C,D);
end
X
where
function [X]=Test(A,B,C,D)
%other parts of the code
X = linsolve(K,L)
end
Try this:
function X = Test(A,B,C,D)
% allocate output (it is faster than changing the size in every loop)
X = cell(1, numel(A));
% loop for each position in A
for i = 1:numel(A)
%in the other parts you have to use A(i) instead of just A
% ... other parts of code
%overwrite the value in X at position i
X{i} = linsolve(K,L);
end
end
and run it with Test(1:10,2,3,4)
To answer what went wrong before:
When you loop with 'for A=1:10' you overwrite the A that was passed to the function (so the function ignores the A that you passed in), and in each iteration you overwrite the X calculated in the previous iteration (that is why you only see the answer for A=10).
The second attempt should work if you have created a file named Test.m with function X = Test(A,B,C,D) as the first line of code in the file. The global declaration is unnecessary, though. In fact, I would strongly recommend not using global variables, as they get very messy very fast.
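To make the calling pattern concrete, here is a small runnable sketch. The original post does not show how K and L are built, so the anonymous function solveFor, with K = [A B; C D] and L = [1; 1], is purely a hypothetical stand-in for Test.

```matlab
% Hypothetical stand-in for Test: builds K and L from the inputs and solves K*X = L.
solveFor = @(A, B, C, D) linsolve([A B; C D], [1; 1]);

A = 1:10; B = 2; C = 3; D = 4;

% Pre-allocate: each solution is a 2-by-1 vector, one column per value of A.
X = zeros(2, numel(A));
for iter = 1:numel(A)
    X(:, iter) = solveFor(A(iter), B, C, D);
end
```

The loop lives in the caller, so the function itself stays unchanged and each result is stored in its own column instead of overwriting the previous one.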

How does function work regarding to the memory usage?

When you use a function in MATLAB, only the output of the function appears in the workspace; all the other variables that may be created or used in the body of the function are not shown. I am wondering how a function works. Does it clear all the other variables from memory and just keep the output?
A function acts like a small, isolated programming environment. At the front end you pass in your inputs (e.g. variables, strings, name-value pairs). After the function has finished, only the outputs are available; all temporarily created variables are discarded.
function [SUM] = MySum(A)
for ii = 1:length(A)-1
SUM(ii) = A(ii)+A(ii+1);
kk(ii) = ii;
end
end
>> A=1:10
>> MySum(A)
This code just adds two consecutive values for the input array A. Note that the iteration number, stored in kk, is not output and is thus discarded after the function has completed. In MATLAB kk(ii) = ii; will be underlined orange, since it 'might be unused'.
Say you want to also retain kk, just add it to the function outputs:
function [SUM,kk] = MySum(A)
and keep the rest the same.
If you have large variables that you only use up to a certain point and do not want them clogging up memory while the function is running, use clear for that:
function [SUM] = MySum(A)
for ii = 1:length(A)-1
SUM(ii) = A(ii)+A(ii+1);
kk(ii) = ii;
end
clear kk
end
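The isolation can be checked directly from the calling side. In this sketch the function name pairSums and the variable kkVisible are assumptions for illustration; the body mirrors MySum with pre-allocation added.

```matlab
% Local functions in a script need R2016b or later.
A = 1:10;
S = pairSums(A);

% kk exists only inside pairSums, so the caller cannot see it.
kkVisible = exist('kk', 'var');   % 0 means "no such variable here"

function S = pairSums(A)
    n  = numel(A) - 1;
    S  = zeros(1, n);             % output: returned to the caller
    kk = zeros(1, n);             % local: released when the function returns
    for ii = 1:n
        S(ii)  = A(ii) + A(ii+1);
        kk(ii) = ii;
    end
end
```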