I have a script that requires a handful of parameters to run. I'm interested in exploring the results as the parameters change, so I define a few scan arrays at the top, wrap the whole code in multiple for loops and set the parameter values to the current scan values.
This is error-prone and inelegant. The process for changing the code is: 1) reset the scan variables at the top, 2) comment out e.g. b = scan2(j2) and 3) uncomment b = b0.
What's a better method to allow variables to be set to arrays, and subsequently run the code for all such combinations? Example of my code now:
close all
clear all
%scan1 = linspace(1,4,10);
scan1 = 0;
scan2 = linspace(0,1,10);
scan3 = linspace(-1,0,10);
for j3 = 1:length(scan3)
for j2 = 1:length(scan2)
for j1 = 1:length(scan1)
a = a0;
%b = scan2(j2);
b = b0;
%c = c0;
c = scan3(j3);
d = scan2(j2);
%(CODE BLOCK THAT DEPENDS ON variables a,b,c,d...)
end
end
end
Based on this idea of using one for loop to simulate multiple loops, I tried to adapt it to your case. It is memory-efficient and easy to use, but slower than individual for loops.
%define your parameters
p.a = 1;
p.b = linspace(1,4,4);
p.c = linspace(11,15,5);
p.d = linspace(101,104,4);
p.e = 5;
iterations=structfun(@numel,p);
iterator=cell(1,numel(iterations));
for jx = 1:prod(iterations)
[iterator{:}]=ind2sub(iterations(:).',jx);%.'
%This line uses iterator to extract the corresponding elements of p and creates a struct which contains only scalars.
q=cell2struct(cellfun(@(a,b)(a(b)),struct2cell(p),iterator(:),'uniform',false),fieldnames(p));
%__ (CODE THAT DEPENDS ON q.a to q.e here) __
end
For the scenarios I tested it adds a computational overhead below 0.0002 s per iteration, which is 0.0002*prod(iterations) s in total.
One method is to build arrays that contain all the parameter combinations, using ndgrid. For sufficiently large parameter scans this may become a memory concern, but otherwise it is at least much cleaner, requiring only a single loop and no re-assignments later in the code:
a0vec = 1;
b0vec = linspace(1,4,4);
c0vec = linspace(11,15,5);
d0vec = linspace(101,104,4);
e0vec = 5;
[a0s,b0s,c0s,d0s,e0s] = ndgrid(a0vec,b0vec,c0vec,d0vec,e0vec);
N = numel(a0s);
for j = 1:N
a0 = a0s(j);
b0 = b0s(j);
c0 = c0s(j);
d0 = d0s(j);
e0 = e0s(j);
%__ (CODE THAT DEPENDS ON a0 - e0 here) __
end
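If each parameter combination produces a scalar result, one option is to store it in an array with the same shape as the ndgrid outputs, so that the result can later be indexed by parameter position. The sketch below assumes a two-parameter scan; the names bvec, cvec and results, and the placeholder formula, are made up for illustration.

```matlab
bvec = linspace(1,4,4);          % first scanned parameter (illustrative values)
cvec = linspace(11,15,5);        % second scanned parameter (illustrative values)
[bs, cs] = ndgrid(bvec, cvec);   % bs and cs are both 4-by-5

results = zeros(size(bs));       % one slot per parameter combination
for j = 1:numel(bs)
    b = bs(j);
    c = cs(j);
    results(j) = b + 10*c;       % placeholder for the real computation
end

% results(ib, ic) now corresponds to bvec(ib) and cvec(ic)
```

Because ndgrid varies its first input along the first dimension, linear index j and the subscript pair (ib, ic) refer to the same combination in bs, cs and results.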
Would still like to see your suggestions!
I'm trying to speed up the simulation of some panel data in Matlab. I have to simulate first over individuals (loop index ii from 1 to N) and then for each individual over age (loop index jj from 1 to JJ). The code is slow because inside the two loops there is a bilinear interpolation to do.
Since the iterations in the outer loop are independent, I tried to use parfor in the outer loop (the loop indexed by ii), but I get the error message "the parfor cannot run due to the way the variable h_sim is used". Could someone explain why and how to solve the problem if possible? Any help is greatly appreciated!
a_sim = zeros(Nsim,JJ);
h_sim = zeros(Nsim,JJ);
% Find point on a_grid corresponding to zero assets
aa0 = find_loc(a_grid,0.0);
% Zero housing
hh0 = 1;
a_sim(:,1) = a_grid(aa0);
h_sim(:,1) = h_grid(hh0);
parfor ii=1:Nsim % illegal
for jj=1:JJ-1
z_c = z_sim_ind(ii,jj);
apol_interp = griddedInterpolant({a_grid,h_grid},apol(:,:,z_c,jj));
hpol_interp = griddedInterpolant({a_grid,h_grid},hpol(:,:,z_c,jj));
a_sim(ii,jj+1) = apol_interp(a_sim(ii,jj),h_sim(ii,jj));
h_sim(ii,jj+1) = hpol_interp(a_sim(ii,jj),h_sim(ii,jj));
end
end
I think @Ben Voigt's suggestion was correct. To spell it out, do something like this:
parfor ii=1:Nsim
a_sim_row = a_sim(ii,:);
h_sim_row = h_sim(ii,:);
for jj=1:JJ-1
z_c = z_sim_ind(ii,jj);
apol_interp = griddedInterpolant({a_grid,h_grid},apol(:,:,z_c,jj));
hpol_interp = griddedInterpolant({a_grid,h_grid},hpol(:,:,z_c,jj));
a_sim_row(jj+1) = apol_interp(a_sim_row(jj),h_sim_row(jj));
h_sim_row(jj+1) = hpol_interp(a_sim_row(jj),h_sim_row(jj));
end
a_sim(ii,:) = a_sim_row;
h_sim(ii,:) = h_sim_row;
end
This is a fairly standard parfor pattern for working around the limitation (in this case, parfor cannot tell that what you're doing is order-independent as far as the outer loop is concerned): extract a whole slice, do whatever is needed, then put the whole slice back.
Using MATLAB R2019a, is there any way to avoid the for loops in the following code, even though the dimensions contain different elements, so that each element has to be checked? M is a vector of indices, and Inpts.payout is a 5D array of numerical data.
for m = 1:length(M)-1
for power = 1:noScenarios
for production = 1:noScenarios
for inflation = 1:noScenarios
for interest = 1:noScenarios
if Inpts.payout(M(m),power,production,inflation,interest)<0
Inpts.payout(M(m+1),power,production,inflation,interest)=...
Inpts.payout(M(m+1),power,production,inflation,interest)...
+Inpts.payout(M(m),power,production,inflation,interest);
Inpts.payout(M(m),power,production,inflation,interest)=0;
end
end
end
end
end
end
It is quite simple to remove the inner 4 loops. This will be more efficient unless you have a huge matrix Inpts.payout, as a new indexing matrix must be generated.
The following code extracts the two relevant 'planes' from the input data, does the logic on them, then writes them back:
for m = 1:length(M)-1
payout_m = Inpts.payout(M(m),:,:,:,:);
payout_m1 = Inpts.payout(M(m+1),:,:,:,:);
indx = payout_m < 0;
payout_m1(indx) = payout_m1(indx) + payout_m(indx);
payout_m(indx) = 0;
Inpts.payout(M(m),:,:,:,:) = payout_m;
Inpts.payout(M(m+1),:,:,:,:) = payout_m1;
end
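One way to gain confidence in the vectorized version is to run it against the element-by-element logic on random data. The sketch below does that; the array sizes, the index vector M, and the names P, Pref and Pvec are made up for illustration and stand in for Inpts.payout.

```matlab
rng(0);                                  % fixed seed so the check is repeatable
n = 3;                                   % stand-in for noScenarios
P = randn(4, n, n, n, n);                % stand-in for Inpts.payout
M = [1 2 4];                             % stand-in index vector

% Reference: element-by-element logic, with the four inner loops
% collapsed into one loop over the columns of a 2-D view.
Pref = reshape(P, size(P, 1), []);
for m = 1:length(M)-1
    for k = 1:size(Pref, 2)
        if Pref(M(m), k) < 0
            Pref(M(m+1), k) = Pref(M(m+1), k) + Pref(M(m), k);
            Pref(M(m), k) = 0;
        end
    end
end
Pref = reshape(Pref, size(P));

% Vectorized: logical indexing on whole planes.
Pvec = P;
for m = 1:length(M)-1
    pm  = Pvec(M(m), :, :, :, :);
    pm1 = Pvec(M(m+1), :, :, :, :);
    idx = pm < 0;
    pm1(idx) = pm1(idx) + pm(idx);
    pm(idx)  = 0;
    Pvec(M(m), :, :, :, :)   = pm;
    Pvec(M(m+1), :, :, :, :) = pm1;
end

agree = isequal(Pref, Pvec);             % true if both versions match exactly
```

Both versions perform the same additions on the same elements, so an exact comparison with isequal is appropriate here rather than a tolerance.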
It is possible to avoid extracting the 'planes' and writing them back by working directly with the input data matrix, but this yields more complex code.
We can, however, easily avoid some of the indexing operations this way:
payout_m = Inpts.payout(M(1),:,:,:,:);
for m = 1:length(M)-1
payout_m1 = Inpts.payout(M(m+1),:,:,:,:);
indx = payout_m < 0;
payout_m1(indx) = payout_m1(indx) + payout_m(indx);
payout_m(indx) = 0;
Inpts.payout(M(m),:,:,:,:) = payout_m;
payout_m = payout_m1;
end
Inpts.payout(M(end),:,:,:,:) = payout_m; % write back the last plane
It seems like there is not a way to avoid this. I am assuming that each for loop independently changes a parameter used in the main calculation, so this many for loops are required. My only suggestion is to turn your nested loops into a function if you're concerned about appearance. I'm not sure whether this will help run time.
I have a series of nested loops that works to store data in a cell array. I am trying to find ways to speed up the loop and also help to simplify the readability. I have already optimized the loop a fair bit, but would like to see if I could vectorize it further. My original code looked like this:
%% ORIGINAL LOOP
for iA = 1:length(arrA)
for iB = 1:length(arrB)
for iC = 1:length(arrC)
a = arrA(iA); % depends only on iA
a_x = AData.x(AData.a==a);
a_y = AData.y(AData.a==a);
b = arrB(iB); % depends only on iB
b_x = BData.x(BData.b==b);
b_y = BData.y(BData.b==b);
c = arrC(iC); % depends only on iC
FinalData{iA,iB,iC} = computedata(a_x, a_y, b_x, b_y, c);
end
end
end
Since the calculations for a, a_x, a_y depended only on iA I pulled them out of the inner loops, and did similarly for the other variables, which increased performance significantly:
%% FASTER LOOP
for iA = 1:length(arrA)
a = arrA(iA);
a_x = AData.x(AData.a==a);
a_y = AData.y(AData.a==a);
for iB = 1:length(arrB)
b = arrB(iB);
b_x = BData.x(BData.b==b);
b_y = BData.y(BData.b==b);
for iC = 1:length(arrC)
c = arrC(iC);
FinalData{iA,iB,iC} = computedata(a_x, a_y, b_x, b_y, c);
end
end
end
I am wondering if there is yet a better way to speed up this process, perhaps by MATLAB vectorization (elimination of loops altogether).
I also wanted to make it more compact and easier to rearrange the order of the loops if need be, for other functions I plan to design for plotting things in various orders. Any tips would be greatly appreciated.
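One possible direction, offered only as a sketch, is to precompute the per-value subsets once with arrayfun and then run a single loop over a linear index, recovering the three subscripts with ind2sub. The data values and the computedata handle below are hypothetical stand-ins so that the example runs on its own; whether this is faster than the hoisted loops would need to be measured.

```matlab
% Hypothetical stand-in data and function, for illustration only.
arrA = [1 2]; arrB = [10 20 30]; arrC = [0.5 1];
AData.a = [1 1 2 2];     AData.x = [1 2 3 4]; AData.y = [5 6 7 8];
BData.b = [10 20 20 30]; BData.x = [1 2 3 4]; BData.y = [4 3 2 1];
computedata = @(ax, ay, bx, by, c) c * (sum(ax) + sum(ay) + sum(bx) + sum(by));

% Precompute the subsets that depend on only one index.
Ax = arrayfun(@(a) AData.x(AData.a == a), arrA, 'UniformOutput', false);
Ay = arrayfun(@(a) AData.y(AData.a == a), arrA, 'UniformOutput', false);
Bx = arrayfun(@(b) BData.x(BData.b == b), arrB, 'UniformOutput', false);
By = arrayfun(@(b) BData.y(BData.b == b), arrB, 'UniformOutput', false);

% One loop over all (iA, iB, iC) combinations.
sz = [numel(arrA), numel(arrB), numel(arrC)];
FinalData = cell(sz);
for k = 1:prod(sz)
    [iA, iB, iC] = ind2sub(sz, k);
    FinalData{k} = computedata(Ax{iA}, Ay{iA}, Bx{iB}, By{iB}, arrC(iC));
end
```

Since the loop body no longer depends on the nesting order, rearranging the traversal only requires changing sz and the order of the ind2sub outputs.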
I have a multi-start fmincon code. One variable needs to be determined: u0. Inside ObjectiveFunc there is a variable, Parameter, that I need to output when running multi-start, so I am trying to output a parameter that changes inside an objective function. I wrote a simple example below.
How can I output the value of Parameter inside Func(u0) below when running run(ms,Prob,bigstart)?
ObjectiveFunc = @(u0) Func(u0);
gs = GlobalSearch; ms = MultiStart(gs); opts = optimoptions(@fmincon);
Prob = createOptimProblem('fmincon','x0',1,'objective',ObjectiveFunc,'options',opts);
u0_ini_range = 0.1:1:20;
[u0_iniGrid] = ndgrid(u0_ini_range);
W = u0_iniGrid(:);
bigstart = CustomStartPointSet(W);
[u0_OptVal Delta_u0] = run(ms,Prob,bigstart);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function Delta_u0 = Func(u0)
Parameter = randn(1);
Delta_u0 = u0+Parameter;
There are a variety of ways to solve this issue. None of them are pretty. The problem is complicated by the use of MultiStart, which runs multiple instances simultaneously, only one of which returns the global optimum. This means that the last evaluation of your objective function (and its call to randn) isn't necessarily the relevant one.
Here's one way to accomplish this with persistent variables. First, your objective function needs to look something like this (here Delta is your Parameter):
function [z,Delta_u0_Opt]=HBM1_SlipSolution02hbm3Fcn(u0)
persistent Delta idx;
if isempty(idx)
idx = 0; % Initialize persistent index
end
if nargout > 1
% Output Delta history
Delta_u0_Opt = Delta;
z = (u0-Delta).^2;
else
% Normal operation
idx = idx+1;
Delta(idx) = randn; % Save Delta history
z = (u0-Delta(idx)).^2;
end
Then run your setup code as before:
ObjectiveFunc = @(u0)HBM1_SlipSolution02hbm3Fcn(u0);
u0_ini0 = 0;
gs = GlobalSearch;
ms = MultiStart(gs);
opts = optimoptions(@fmincon);
Lowbound = 0.1;
Xst_rig = 1;
Upbound = 20*Xst_rig;
Prob = createOptimProblem('fmincon','x0',u0_ini0,'objective',ObjectiveFunc,...
'lb',Lowbound,'options',opts);
u0_ini_range = 0.1:1:Upbound;
[u0_iniGrid] = ndgrid(u0_ini_range);
W = u0_iniGrid(:);
bigstart = CustomStartPointSet(W);
[u0_OptVal,fval1] = run(ms,Prob,bigstart);
Then extract your Delta value:
[fval2,Delta_u0_Opt] = HBM1_SlipSolution02hbm3Fcn(u0_OptVal)
Delta_u0_Opt = Delta_u0_Opt(fval1==fval2)
The main problem with this solution is that the entire history of Delta must be retained via constantly appending to a vector. And it requires that the function value of the solution, fval1, uniquely match only one of fval2. The solution could probably be optimized a bit and the last issue resolved by saving more state history and clever use of an output function. I have no idea how or if this would work if you decide to turn on UseParallel. As you can see, this optimization scheme is not at all designed for what you're trying to get it to do.
Finally, are you sure that it's a good idea to use random values in the way that you are for this kind of optimization scheme? At minimum, be sure to specify a seed so results can be replicated. You might consider creating your own global optimization method based on fmincon if you want something more straightforward and efficient.
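As a minimal illustration of the seeding advice, the sketch below resets the generator with rng before each draw and checks that the same numbers come back. The seed value 42 and the variable names are arbitrary choices for the example.

```matlab
rng(42);                  % 42 is an arbitrary seed chosen for illustration
first = randn(1, 3);      % draw three normal random numbers
rng(42);                  % resetting the seed restarts the same sequence
second = randn(1, 3);
same = isequal(first, second);   % true: both draws are identical
```

In the optimization setting, calling rng once before run would make the sequence of randn values reproducible across serial runs.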
I have about 50 different arrays and I want to perform the following operation on all of them:
data1(isnan(data1)) = 0;
coldata1 = nonzeros(data1);
avgdata1 = mean(coldata1);
and so on for data2, data3 etc... the goal being to turn data1 into a vector without NaNs and then take a mean, saving the vector and the mean into coldata1 and avgdata1.
I'm looking for a way to automate this for all 50, rather than copy it 50 times and change the numbers... any ideas? I've been playing with eval but no luck so far. Also tried:
for y = 1:50
data(y)(isnan(data(y))) = 0;
coldata(y) = nonzeros(data(y));
avgdata(y) = mean(coldata(y));
end
You can do it with eval but really should not. Rather use a cell array as suggested here: Create variables with names from strings
i.e.
for y = 1:50
data{y}(isnan(data{y})) = 0;
coldata{y} = nonzeros(data{y});
avgdata{y} = mean(coldata{y});
end
Also read How can I create variables A1, A2,...,A10 in a loop? for alternative options.
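Once the arrays live in a cell array, the loop can also be written with cellfun. The following is a sketch under the assumption that every cell holds a numeric array; the sample values in data are made up. Like the original approach, it drops both NaNs and zeros before averaging.

```matlab
% Made-up sample data: two arrays containing NaNs and a zero.
data = {[1 NaN 3; 0 5 NaN], [NaN 2 2]};

% Remove NaNs, then keep only the nonzero entries as a column vector.
coldata = cellfun(@(x) nonzeros(x(~isnan(x))), data, 'UniformOutput', false);

% Mean of each cleaned vector, returned as a numeric row vector.
avgdata = cellfun(@mean, coldata);
```

Here coldata{y} and avgdata(y) play the roles of coldata1, avgdata1 and so on, without generating variable names dynamically.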