functions inside a matlab parfor loop - matlab

Can you use functions inside a matlab parfor loop? for instance I have a code that looks like:
matlabpool open 2
Mat=zeros(100,8);
parfor(i=1:100)
Mat(i,:)=foo();
end
Inside the function I have a bunch of other variables. In particular there is a snippet of code that looks like this:
function z=foo()
err=1;
a=zeros(10000,1);
p=1;
while(err>.0001)
%statements to update err
% .
% .
% .
p=p+1;
%if out of memory allocate more
if(p>length(a))
a=[a;zeros(length(a),1)];
end
end
%trim output after while loop
if(p<length(a))
a(p+1:end)=[];
end
%example output
z=1:8;
end
I read somewhere that all variables that grow inside of a for loop nested inside of a matlab parfor loop must be preallocated, but in this case I have a variable that is preallocated, but might grow later. matlab did not give me any errors when i used mlint, but I was wondering if there are issues that I should be aware of.
Thanks,
-akt

According to Mathworks' documentation, your implementation of the matrix Mat is a sliced variable. That means you update different "slices" of the same matrix in different iterations, but the iterations do not affect each other. There is no data dependency between the loops. So you are ok to go.
Growing a inside function foo does not affect the parfor, because a is a regular variable located in foo's stack. You can do anything with a.
There do exist several issues to notice:
Don't use i and j as iteration counters
Defining i or j in any purpose is bad.
I have never been bored to refer people to this post - Using i and j as variables in Matlab.
Growing a is bad
Every time you do a=[a;zeros(length(a),1)]; the variable is copied as a whole into a new empty place in RAM. As its size doubles each time, it could be a disaster. Not hard to imagine.
A lighter way to "grow" -
% initialize a list of pointers
p = 1;
cc = 1;
c{1} = zeros(1000,1);
% use it
while (go_on)
% do something
disp(c{cc})
....
p=p+1;
if (p>1000)
cc = cc+1;
c{cc} = zeros(1000,1);
p = 1;
end
end
Here, you grow a list of pointers, a cell array c. It's smaller, faster, but still it needs copying in memory.
Use minimal amount of memory
Suppose you only need a small part of a, that is, a(end-8:end), as the function output. (This assumption bases on the caller Mat(i,:)=foo(); where size(Mat, 2)=8. )
Suppose err is not related to previous elements of a, that is, a(p-1), a(p-2), .... (I will loosen this assumption later.)
You don't have to keep all previous results in memory. If a is used up, just throw it.
% if out of memory, don't allocate more; flush it
if (p>1000)
p = 1;
a = zeros(1000,1);
end
The second assumption may be loosen to that you only need a certain number of previous elements, while this number is already known (hopefully it's small). For example,
% if out of memory, flush it, but keep the last 3 results
if (p>1000)
a = [a(end-3:end); zeros(997,1)];
p = 4;
end
Trimming is not that complicated
% trim output after while loop
a(p+1:end)=[];
Proof:
>> a=1:10
a =
1 2 3 4 5 6 7 8 9 10
>> a(3:end)=[]
a =
1 2
>> a=1:10
a =
1 2 3 4 5 6 7 8 9 10
>> a(11:end)=[]
a =
1 2 3 4 5 6 7 8 9 10
>>
The reason is end is 10 (although you can't use it as a stanalone variable), and 11:10 gives an empty array.

The short answer is yes, you can call a function inside a parfor.
The long answer is that parfor only works if each iteration inside the parfor is independent of the other iterations. Matlab has checks to catch when this isn't the case; though I don't know how full-proof they are. In your example, each foo() call can run independently and store its return value in a specific location of Mat that won't be written to or read by any other parfor iteration, so it should work.
This breaks down if values in Mat are being read by foo(). For example, if the parfor ran 4 iterations simultaneously, and inside each iteration, foo() was reading from Mat(1) and then writing a new value to Mat(1) that was based on what it read, the timing of the reads/writes would change the output values, and matlab should flag that.

Related

MATLAB code requires too much time for compiling

I am trying to compute this
in MATLAB but the code requires about 8 hours to compile. In particular e, Ft=[h(t);q(t)] and Omega are 2x1 matrices (e' is 1x2), Gamma is a 2x2 matrix and n=30. Can someone help me to optimize this code?
I tried in this way:
aux=[0;0];
for k=0:29
for j=1:k-1
aux=[aux Gamma^j*Omega];
end
E(t,k+1)= e'*(sum(aux,2)+Gamma^k*[h(t);q(t)]);
end
Vix=1/30*sum(E,2);
EDIT
now I changed into this and it is faster, but I am not sure that I am applying correctly the formula in the picture...
for t=2:T
% 1. compute today's volatility
csi(t) = log(SP500(t)/SP500(t-1))-r(t)+0.5*h(t);
q(t+1) = omega+rho*q(t)+phi*((csi(t)-lambda*sqrt(h(t)))^2-h(t));
h(t+1) = q(t)+alpha*((csi(t)-lambda*sqrt(h(t)))^2-q(t))+beta*(h(t)-q(t));
for k=1:30
aux=zeros(2,k);
for j=0:k-1
aux(:,j+1)=Gamma^j*Omega;
end
E(t,k)= e'*(sum(aux,2)+Gamma^k*[h(t);q(t)]);
end
end
Vix(2:end)=1/30*sum(E(2:end,:),2);
(I don't need Vix(1))
Here are some reasons I can think of:
REPEATED COPYING(No preallocation) The main reason for the long run time is the line aux=[aux Gamma^j*Omega] line, in which an array is concatenated at every loop iteration. MATLAB's debugger should have flagged this for you in its editor and should have cited that "memory preallocation" using zeros should be implemented.
Essentially, when one concatenates arrays this way, MATLAB is internally making copies of the array at every loop iteration, thus, in addition to the math operations copying is taking place. As the array grows, the copying operations become ever more expensive. This is avoided by preallocation, which consists of predefining the size of the storage array (in this case the variable aux) so that MATLAB doesn't have to keep on allocating space on the go. Try:
aux = zeros(2, 406); %Creates a 2 by 406 array. I explain how I get 406 below:
p = 0; %A variable that indexes the columns of aux
for k=0:29
for j=1:k-1
p = p+1; %Update column counter
aux(:,p) = Gamma^j*Omega; % A 2x2 matrix multiplied by a 2x1 matrix yields a 2x1.
end
E(t,k+1)= e'*(sum(aux,2)+Gamma^k*[h(t);q(t)]);
end
Vix=1/30*sum(E,2);
Now, MATLAB simply overwrites the individual elements of aux instead of copying aux, and concatenating it with Gamma^j*Omega, and then overwriting aux. Essentially, the above makes MATLAB allocate space for aux ONCE instead of 406 times. I figured out that aux ends up being a 2 by 406 array for the n=30 case in the end by running this code:
p = 0;
for k = 0:29
for j = 1:k-1
p = p + 1;
end
end
To know the final size of aux for other values of n you should see if a formula for it is available (or derive your own).
LOOPING TRANSPOSITION OF A CONSTANT?
Next, e'. As you may know, ' is the transpose operation. From your sample code, the variable e is not edited inside the for loops, yet you have the ' operator inside the outer for loop. If you perform the transpose operation once outside the outer for loop you save yourself the expense of transposing it at every loop iteration.
RUNNING TOTAL
As a final note, I would suggest replacing sum(aux,2) with a variable that keeps a running total. This is because currently, this makes MATLAB sum over the entirety of aux at every loop iteration.
Hope this helps mate.

Is it possible to change two function using parfor loop?

Suppose I have two functions written on different scripts, say function1.m and function2.m The two computation in the two functions are independent (Some inputs may be the same, say function1(x,y) and function2(x,z) for example). However, running sequentially, say ret1 = function1(x,y); ret2 = function2(x,z); may be time consuming. I wonder if it is possible to run it in parfor loop:
parfor i = 1:2
ret(i) = run(['function' num2str(i)]); % if i=1,ret(1)=function1 and i=2, ret(2)=function2
end
Is it possible to write it in parfor loop?
Your idea is correct, but the implementation is wrong.
Matlab won't let you use run within parfor as it can't make sure it's a valid way to use parfor (i.e. no dependencies between iterations). The proper way to do that is to use functions (and not scrips) and an if statement to choose between them:
ret = zeros(2,1);
parfor k = 1:2
if k==1, ret(k) = f1(x,y); end
if k==2, ret(k) = f2(x,z); end
end
here f1 and f2 are some functions that return a scalar value (so it's suitable for ret(k) and each instance of the loop call a different if statement.
You can read here more about how to convert scripts to functions.
The rule of thumb for a parfor loop is that each iteration must be standalone. More accurately,
The body of the parfor-loop must be independent. One loop iteration
cannot depend on a previous iteration, because the iterations are
executed in a nondeterministic order.
That means that every iteration must be one which can be performed on its own and produce the correct result.
Therefore, if you have code that says, for instance,
parfor (i = 1:2)
function1(iterator,someNumber);
function2(iterator,someNumber);
end
there should be no issue with applying parfor.
However, if you have code that says, for instance,
persistentValue = 0;
parfor (i = 1:2)
persistentValue = persistentValue + function1(iterator,someNumber);
function2(iterator,persistentValue);
end
it would not be usable.
Yes. It is possible.
Here's an example:
ret = zeros(2,1);
fHandles = {#min, #max};
x = 1:10;
parfor i=1:2
ret(i) = fHandles{i}(x);
end
ret % show the results.
Whether this is a good idea or not, I don't know. There is overhead to setting up the parallel processing that may or may not make it worthwhile for you.
Typically the more iterations you have computed, the more value you get from setting up a parfor loop as the iterations are sliced-up and sent non-deterministically to the separate cores for processing. So you're getting use of 2 cores right now, but if you have many functions this may improve things.
The order that the iterations are run is not guaranteed (it could be that one core gets assigned a range of values for i, but we do not know if it those values are taken in order or randomly), so your code can't depend on other iterations of the loop.
In general, the MATLAB editor is pretty at flagging these issues ahead of time.
EDIT
Here's a proof of concept for a variable number of arguments to your different functions
ret = zeros(2,1);
fHandles = {#min, #max};
x = 1:10; % x is a 1x10 vector
y = rand(20); % y is a 20x20 matrix
z = 1; % z is a scalar value
fArgs = {{x};
{y,z}}; %wrap your arguments up in a cell
parfor i=1:2
ret(i) = fHandles{i}([fArgs{i}{:}]); %calls the function with its variable sized arguments here
end
ret % show the output
Again, this is just proof-of-concept. There are big warnings showing up in MATLAB about having to broadcast fArgs across all of the cores.

parfor error "Index exceeds matrix dimensions"

The following code works, but if I change for into parfor, it gave an error
Index exceeds matrix dimensions
This is my code
a=zeros(3,1);
for t=1:2
ind=randsample(3,2)
a=pf(a,ind)
end
function a=pf(a,ind)
a(ind)=a(ind)+2;
end
How can I get this code working without the error?
You are seeing the error because you are misusing parfor in your code. You haven't read the relevant documentation enough, and you seem to believe that parfor is magic fairy dust that makes your computation faster, regardless of computation. Well, I have bad news.
Let's take a closer look at your example:
a = zeros(3,1);
% usual for
disp('before for')
for t=1:2
ind = randsample(3,2);
a = pf(a,ind);
disp(a); % add printing line
end
% parfor
disp('before parfor')
parfor t=1:2
ind = randsample(3,2);
a = pf(a,ind);
disp(a); % add printing line
end
The output:
before for
2
2
0
2
4
2
before parfor
Error: The variable a is perhaps intended as a reduction variable, but is actually an uninitialized temporary.
See Parallel for Loops in MATLAB, "Temporary Variables Intended as Reduction Variables".
As you can see, in the latter case there are no prints inside the parfor, so it doesn't even get run. See also the warning about the type of variables. The variable a is being misidentified by the execution engine because what you are doing to it doesn't make any sense.
So what to do instead? You need to formulate your problem in a way that is compatible with parfor. This will, alas, depend on what exactly you're doing to your matrix. For your specific case of incrementing random elements, I suggest that you gather the increments separately in the loop, and sum them up afterwards:
a = zeros(3,1); % only needed for size; assumed that it exists already
numiters = 2;
increments = zeros([size(a), numiters]); % compatible with a proper 2d array too
parfor t=1:numiters
ind = randsample(3,2);
% create an auxiliary increment array so that we can use a full slice of 'increments'
new_contrib = zeros(size(a));
new_contrib(ind) = 2;
increments(:,t) = new_contrib;
disp(increments(:,t)); % add printing line
end
% collect increments along last axis
a = sum(increments,ndims(increments));
disp(a)
Output:
2
0
2
2
2
0
4
2
2
Note the lack of warnings and the presence of a meaningful answer. Refactoring the loop this way transparently signals MATLAB what the variables are doing, and that increments is being filled up by independent iterations of the parfor loop. This is the way in which parfor can "speed up calculations", a very specific and controlled way that implies restrictions on the logistics used inside the loop.
n = 2;
a=zeros(3,1);
ind=zeros(3,2,n);
for ii = 1:n
ind(:,:,ii) = randsample(3,2);
end
for t=1:n
a=pf(a,ind(:,:,t));
end
function a=pf(a,ind)
a(ind)=a(ind)+2;
end
The above gets the randsample out of the loop, which is probably the issue here. Note that randsample does not support direct 3D matrix creation, so I initialised that in a loop.

Idiomatic way of exiting from multiple nested loops?

The MATLAB documentation describes the break keyword thus:
break terminates the execution of a for or while loop. Statements in the loop after the break statement do not execute.
In nested loops, break exits only from the loop in which it occurs. Control passes to the statement that follows the end of that loop.
(my emphasis)
What if you want to exit from multiple nested loops? Other languages, such as Java, offer labelled breaks, which allow you to specify where the flow of control is to be transferred, but MATLAB lacks such a mechanism.
Consider the following example:
% assume A to be a 2D array
% nested 'for' loops
for j = 1 : n
for i = 1 : m
if f(A(i, j)) % where f is a predicate
break; % if want to break from both loops, not just the inner one
else
% do something interesting with A
end
end
% <--- the break transfers control to here...
end
% <--- ... but I want to transfer control to here
What is an idiomatic way (in MATLAB) of exiting from both loops?
I would say for your original specific example, rather use linear indexing and a single loop:
%// sample-data generation
m = 4;
n = 5;
A = rand(m, n);
temp = 0;
for k = 1:numel(A)
if A(k) > 0.8 %// note that if you had switched your inner and outer loops you would have had to transpose A first as Matlab uses column-major indexing
break;
else
temp = temp + A(k);
end
end
Or the practically identical (but with less branching):
for k = 1:numel(A)
if A(k) <= 0.8 %// note that if you had switched your inner and outer loops you would have had to transpose A first as Matlab uses column-major indexing
temp = temp + A(k);
end
end
I would think that this answer will vary from case to case and there is no general one size fits all idiomatically correct solution but I would approach it in the following way depending on your problem (note these all assumes that a vectorized solution is not practical as that is the obvious first choice)
Reduce the dimensions of the nesting and use either no breaks or just one single break (i.e. as shown above).
Don't use break at all because unless the calculation of your predicate is expensive and your loop has very many iterations, those extra iterations at the end should be practically free.
Failing that set a flag and break at each level.
Or finally wrap your loop into a function and call return instead of break.
As far as I know there are no such functionality built in. However, in most cases matlab does not need nested loops due to its support for vectorization. In those cases where vectorization does not work, the loops are mostly long and complicated and thus multiple breaks would not hinder readability significantly. As noted in a comment, you would not really need a nested loop here. Vectorization would do the trick,
m = 5;
n=4;
x = rand(m,n);
tmp = find(x>0.8, 1, 'first');
if (isempty(tmp))
tmp = m*n+1;
end
tmp = tmp-1;
tot = sum(x(1:tmp));
There might of course be people claiming that for loops are not necessarily slow anymore, but the fact remains that Matlab is column heavy and using more than one loop will in most cases include looping over non optimal dimensions. Vectorized solutions does not require that since they can use smart methods avoiding such loops (which of course does not hold if the input is a row vector, so avoiding this is also good).
The best idiomatic way to use Python (or poison of your choice) and forget all this but that's another story. Also I don't agree with the vectorization claims of the other answers anymore. Recent matlab versions handle for loops pretty quickly. You might be surprised.
My personal preference goes to raising an exception deliberately and cradling it within a try and catch block.
% assume A to be a 2D array
A = rand(10) - 0.5;
A(3,2) = 0;
wreaker = MException('Loop:breaker','Breaking the law');
try
for j = 1 : size(A,1)
% forloop number 1
for i = 1 : size(A,2)
% forloop number 2
for k = 1:10
% forloop number 3
if k == 5 && j == 3 && i == 6
mycurrentval = 5;
throw(wreaker)
end
end
end
end
catch
return % I don't remember the do nothing keyword for matlab apparently
end
You can change the location of your try catch indentation to fall back to the loop of your choice. Also by slaying kittens, you can write your own exceptions such that they label the exception depending on the nest count and then you can listen for them. There is no end to ugliness though still prettier than having counters or custom variables with if clauses in my opinion.
Note that, this is exactly why matlab drives many people crazy. It silently throws exceptions in a pretty similar way and you get a nonsensical error for the last randomly chosen function while passing by, such as size mismatch in some differential equation solver so on. I actually learned all this stuff after reading a lot matlab toolbox source codes.

MATLAB - vector script

I have recently started learning MatLab, and wrote the following script today as part of my practice to see how we can generate a vector:
x = [];
n = 4;
for i = i:n
x = [x,i^2];
end
x
When I run this script I get what I expect, namely the following vector:
x = 0 1 4 9 16
However, if I run the script a second time right afterwards I only get the following output:
x = 16
What is the reason for this? How come I only get the last vector entry as output the second time I run the script, and not the vector in its entirety? If anyone can explain this to me, I would greatly appreciate it.
Beginning with a fresh workspace, i will simply be the complex number 1i (as in x^2=-1). I imagine you got this warning on the first run:
Warning: Colon operands must be real scalars.
So the for statement basically loops over for i = real(1i):4. Note that real(1i)=0.
When you rerun the script again with the variables already initialized (assuming you didn't clear the workspace), i will refer to a variable containing the last value of 4, shadowing the builtin function i with the same name, and the for-loop executes:
x=[];
for i=4:4
x = [x, i^2]
end
which iterates only one time, thus you end up with x=16
you forget to initialize i.
after first execution i is 4 and remains 4.
then you initialize x as an empty vector but because i is 4 the loop runs only once.
clear your workspace and inspect it before and after first execution.
Is it possibly a simple typo?
for i = i:n
and should actually mean
for i = 1:n
as i is (probably) uninitialized in the first run, and therefore 0, it works just fine.
The second time, i is still n (=4), and only runs once.
Also, as a performance-tip: in every iteration of your loop you increase the size of your vector, the more efficient (and more matlaboid) way would be to create the vector with the basevalues first, for example with
x = 1:n
and then square each value by
x = x^2
In Matlab, using vector-operations (or matrix-operations on higher dimensions) should be prefered over iterative loop approaches, as it gives matlab the opportunity to do optimised operations. It is also often more readable that way.