Parallel computation of recursive tree structure - matlab

I am trying to build a recursive tree structure in MATLAB. Since the tree is pretty big, I want to speed up the calculation as much as possible, which is why I want to try running it in parallel.
An example of what I am trying to do may look like this:
function result = minimalExample(x)
n = numel(x);
if n == 0
result = 0;
return;
end
if n==1
result = x;
return
end
average = mean(x);
result = minimalExample(x(x<average))+minimalExample(x(x>average));
if any(x==average)
result = result*average;
end
end
I tried to use parfeval to do the calculation like this:
function result = minimalExampleParallel(x,pool)
n = numel(x);
if n == 0
result = 0;
return;
end
if n==1
result = x;
return
end
average = mean(x);
f(1) = parfeval(pool,@minimalExampleParallel,1,x(x<average),pool);
f(2) = parfeval(pool,@minimalExampleParallel,1,x(x>average),pool);
result = 0;
for i = 1:2
[~,value] = fetchNext(f);
result = result + value;
end
if any(x==average)
result = result*average;
end
end
But I get the error "Workers cannot execute parfeval or parfevalOnAll."
I was hoping there would be a way of adding jobs to a global job queue that the workers may reach too, but I haven't been able to do so.
Is this in any way possible? And if so how? And if not so why?

I know you asked this quite some time ago, but I have an answer. I too was looking for such a solution, but could not find anything. There's nothing really posted online though so I want this to be here in case anyone else wants to know how.
You can do a sort of global job scheduler. Declare a parcluster then create separate jobs on said cluster using createCommunicatingJob. Then, and this is important, set the NumWorkersRange property of each job to something less than the total number of workers, preferably dividing them equally, though you can do it however you want. (Ex: If you have 12 workers and 4 jobs, set each worker range to [1 3]). If you do not do this, then it will queue the other tasks and only perform the first, as it will allocate all workers to the first task.
This works with functions that contain parfeval statements in them and the syntax is similar to parfeval. I verified using MATLAB's paralleldemo_blackjack_parfeval function. I hope this was what you were looking for. I know it's what I was looking for!
Link: http://www.mathworks.com/help/distcomp/createcommunicatingjob.html
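A minimal sketch of the setup described above, assuming the Parallel Computing Toolbox and a default cluster profile are available. The inputs are made up, @sum is only a stand-in for the real task function, and 'Type','pool' is an assumption about the job type. Whether nested parfeval calls succeed inside the task is taken from the answer and is not verified here.

```matlab
% Sketch only: needs Parallel Computing Toolbox and a working cluster profile.
c = parcluster();                        % default cluster profile
inputs = {1:10, 11:20, 21:30, 31:40};    % hypothetical inputs, one per job
jobs = cell(1, numel(inputs));
for k = 1:numel(inputs)
    jobs{k} = createCommunicatingJob(c, 'Type', 'pool');
    jobs{k}.NumWorkersRange = [1 3];     % cap each job so the others are not starved
    createTask(jobs{k}, @sum, 1, inputs(k));   % @sum stands in for the real function
    submit(jobs{k});
end
results = cell(1, numel(jobs));
for k = 1:numel(jobs)
    wait(jobs{k});                       % block until this job has finished
    out = fetchOutputs(jobs{k});         % cell array with the task outputs
    results{k} = out{1};
    delete(jobs{k});                     % remove the finished job from the cluster
end
```

All jobs are submitted before any of them is waited on, so the scheduler is free to run them side by side within the worker limits.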

Related

How do I properly "slice" a 4D matrix in Matlab in a parfor loop?

I am trying to make a portion of my code run faster in MATLAB, and I'd like to use parfor. When I try to, I get the following error about one of my variables, D_all.
"The PARFOR loop cannot run because of the way D_all is used".
Here is a sample of my code.
M = 161;
N = 24;
P = 161;
parfor n=1:M*N*P
[j,i,k] = ind2sub([N,M,P],n);
r0 = Rw(n,1:3);
R0 = repmat(r0,M*N*P,1);
delta = sqrt(dXnd(i)^2 + dZnd(k)^2);
d = R_prime - R0;
inS = Rw_prime(find(sqrt(sum(d.^2,2))<0.8*delta),:);
if isempty(inS)
D_all(j,i,k,tj) = D_all(j,i,k,tj-1);
else
y0 = r0(2);
inC = inS(find(inS(:,2)==y0),:);
dw = sqrt(sum(d(find(sqrt(sum(d.^2,2))<0.8*delta & d(:,2)==0),:).^2,2));
V_avg = sum(dw.^(-1).*inC(:,4))/sum(dw.^(-1));
D_all(j,i,k,tj) = V_avg;
end
end
I'm not very familiar with parallel computing, and I've looked at the guides online and don't really understand how to apply them to my situation. I guess I need to "slice" D_all but I don't know how to do that.
EDIT: I think I understand that the major problem is that when using D_all I have tj and tj-1.
EDIT 2: I don't show this above, it probably would have been helpful, but I defined D_all(:,:,:,1) = V_1; where V_1 corresponds to a previous time step. I tried making multiple variables V_2, V_3, etc. for each step and replacing D_all(j,i,k,tj-1) with V_1(j,i,k). This still led to the same error I am seeing with D_all.
"Valid indices for D_all are restricted for PARFOR loops"

Speeding up matlab for loop

I have a system of 5 ODEs with nonlinear terms involved. I am trying to vary 3 parameters over some ranges to see what parameters would produce the necessary behaviour that I am looking for.
The issue is I have written the code with 3 for loops and it takes a very long time to get the output.
I am also storing the parameter values within the loops when it meets a parameter set that satisfies an ODE event.
This is how I have implemented it in matlab.
function [m,cVal,x,y]=parameters()
b=5000;
q=0;
r=10^4;
s=0;
n=10^-8;
time=3000;
m=[];
cVal=[];
x=[];
y=[];
val1=0.1:0.01:5;
val2=0.1:0.2:8;
val3=10^-13:10^-14:10^-11;
for i=1:length(val1)
for j=1:length(val2)
for k=1:length(val3)
options = odeset('AbsTol',1e-15,'RelTol',1e-13,'Events',@eventfunction);
[t,y,te,ye]=ode45(@(t,y)systemFunc(t,y,[val1(i),val2(j),val3(k)]),0:time,[b,q,s,r,n],options);
if length(te)==1
m=[m;val1(i)];
cVal=[cVal;val2(j)];
x=[x;val3(k)];
y=[y;ye(1)];
end
end
end
end
Is there any other way that I can use to speed up this process?
Profile viewer results
I have written the system of ODEs in a format like
function s=systemFunc(t,y,p)
s= zeros(2,1);
s(1)=f*y(1)*(1-(y(1)/k))-p(1)*y(2)*y(1)/(p(2)*y(2)+y(1));
s(2)=p(3)*y(1)-d*y(2);
end
f,d,k are constant parameters.
The equations are more complicated than what's here, as it's a system of 5 ODEs with lots of nonlinear terms interacting with each other.
Tommaso is right. Preallocating will save some time.
But I would guess that there is fundamentally not a lot you can do since you are running ode45 in a loop. ode45 itself may be the bottleneck.
I would suggest you profile your code to see where the bottleneck is:
profile on
parameters(... )
profile viewer
I would guess that ode45 is the problem. Probably you will find that you should actually focus your time on optimizing the systemFunc code for performance. But you won't know that until you run the profiler.
EDIT
Based on the profiler output and additional code, I see some things that will help
It seems like building the parameter vector [val1(i),val2(j),val3(k)] on every call is hurting you. Instead of
@(t,y)systemFunc(t,y,[val1(i),val2(j),val3(k)])
try
@(t,y)systemFunc(t,y,val1(i),val2(j),val3(k))
where your system function is defined as
function s=systemFunc(t,y,p1,p2,p3)
s= zeros(2,1);
s(1)=f*y(1)*(1-(y(1)/k))-p1*y(2)*y(1)/(p2*y(2)+y(1));
s(2)=p3*y(1)-d*y(2);
end
Next, note that you don't have to preallocate s inside systemFunc; just build the output column directly:
function s=systemFunc(t,y,p1,p2,p3)
s = [ f*y(1)*(1-(y(1)/k))-p1*y(2)*y(1)/(p2*y(2)+y(1));
p3*y(1)-d*y(2) ];
end
Finally, note that ode45 is internally taking about 1/3 of your runtime. There is not much you will be able to do about that. I would suggest loosening your 'AbsTol' and 'RelTol' to more reasonable numbers. Those values are really small, and are making ode45 run for a really long time. If you can live with it, try increasing them to something like 1e-6 or 1e-8 and see how much the performance improves. Alternatively, depending on how smooth your function is, you might be able to do better with a different integrator (like ode23), but your mileage will vary.
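To get a feel for how much the tolerances matter, here is a small sketch on a toy problem (y' = -y, not the asker's system). The loose values are only the ones suggested above; the right setting depends on the accuracy the event detection needs.

```matlab
% Toy problem, assumed for illustration only: y' = -y, y(0) = 1.
rhs = @(t, y) -y;
tightOpts = odeset('AbsTol', 1e-15, 'RelTol', 1e-13);   % tolerances from the question
looseOpts = odeset('AbsTol', 1e-8,  'RelTol', 1e-6);    % relaxed tolerances
[tTight, yTight] = ode45(rhs, [0 10], 1, tightOpts);
[tLoose, yLoose] = ode45(rhs, [0 10], 1, looseOpts);
% With a two-element time span, ode45 returns the points it actually stepped
% through (plus refinement points), so the counts reflect the work done.
fprintf('tight: %d output points, loose: %d output points\n', ...
    numel(tTight), numel(tLoose));
fprintf('difference at t = 10: %g\n', abs(yTight(end) - yLoose(end)));
```

Comparing the final values shows how much accuracy is given up in exchange for the shorter run.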
I have two suggestions for you.
Preallocate the vectors in which you store your results and use an increasing index to populate them in each iteration.
Since the options you use are always the same, instantiate them outside the loop only once.
Final code:
function [m,cVal,x,y] = parameters()
b = 5000;
q = 0;
r = 10^4;
s = 0;
n = 10^-8;
time = 3000;
options = odeset('AbsTol',1e-15,'RelTol',1e-13,'Events',@eventfunction);
val1 = 0.1:0.01:5;
val1_len = numel(val1);
val2 = 0.1:0.2:8;
val2_len = numel(val2);
val3 = 10^-13:10^-14:10^-11;
val3_len = numel(val3);
total_len = val1_len * val2_len * val3_len;
m = NaN(total_len,1);
cVal = NaN(total_len,1);
x = NaN(total_len,1);
y = NaN(total_len,1);
res_offset = 1;
for i = 1:val1_len
for j = 1:val2_len
for k = 1:val3_len
[~,~,te,ye] = ode45(@(t,y)systemFunc(t,y,[val1(i),val2(j),val3(k)]),0:time,[b,q,s,r,n],options); % discard t and the solution so the result vector y is not overwritten
if (length(te) == 1)
m(res_offset) = val1(i);
cVal(res_offset) = val2(j);
x(res_offset) = val3(k);
y(res_offset) = ye(1);
end
res_offset = res_offset + 1;
end
end
end
end
If you only want to preserve result values that have been correctly computed, you can remove the rows containing NaNs at the bottom of your function. Indexing on one of the vectors will be enough to clear everything:
rows_ok = ~isnan(y);
m = m(rows_ok);
cVal = cVal(rows_ok);
x = x(rows_ok);
y = y(rows_ok);
In continuation of the other suggestions, I have 2 more suggestions for you:
You might want to try a different solver. ode45 is meant for non-stiff problems, and your problem looks like it could be stiff (the parameters differ by many orders of magnitude). Try ode23s, for instance.
Secondly, without knowing which event you are looking for, it may be possible to use a logarithmic search rather than a linear one, e.g. the bisection method. This will drastically cut down the number of times you have to solve the equation.
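A rough sketch of the bisection idea for a single parameter. It assumes the event switches from "does not fire" to "fires" exactly once as the parameter grows, which may not hold for the real system. eventFires is a hypothetical stand-in; in practice it would wrap one ode45 call and check te.

```matlab
% Hypothetical stand-in for "does the event fire for parameter p?".
% Here the switch is placed at p = 2 purely for illustration.
eventFires = @(p) p >= 2;
lo = 0.1;                    % assumed: event does not fire at lo
hi = 5;                      % assumed: event fires at hi
nSolves = 0;
while hi - lo > 0.01         % stop at the resolution of the original grid
    mid = (lo + hi) / 2;
    nSolves = nSolves + 1;
    if eventFires(mid)
        hi = mid;            % threshold is at or below mid
    else
        lo = mid;            % threshold is above mid
    end
end
fprintf('threshold lies in [%g, %g] after %d evaluations\n', lo, hi, nSolves);
```

Each pass halves the interval, so the number of solves grows with the logarithm of the grid size instead of linearly.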

Basic structure of a for loop

I am trying to write a MATLAB function that accepts non-integer, n, and then returns the factorial of it, n!. I am supposed to use a for loop. I tried with
"for n >= 0"
but this did not work. Is there a way I can fix this?
I wrote the code below, but it doesn't give me the correct answer.
function fact = fac(n);
for fact = n
if n >=0
factorial(n)
disp(n)
elseif n < 0
disp('Cannot take negative integers')
break
end
end
Any kind of help will be highly appreciated.
You need to read the docs and I would highly recommend doing a basic tutorial. The docs state
for index = values
statements
end
So your first idea of for n >= 0 is completely wrong because a for doesn't allow for the >. That would be the way you would write a while loop.
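For comparison, a condition-driven loop of that kind could look like this. It is a small sketch with an example input of 5, assuming n is a non-negative integer.

```matlab
n = 5;              % example input, assumed to be a non-negative integer
result = 1;
k = n;
while k > 1         % a while loop repeats as long as its condition is true
    result = result * k;
    k = k - 1;
end
disp(result)
```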
Your next idea of for fact = n does fit the pattern of for index = values, however, your values is a single number, n, and so this loop will only have one single iteration which is obviously not what you want.
If you wanted to loop from 1 to n you need to create a vector, (i.e. the values from the docs) that contains all the numbers from 1 to n. In MATLAB you can do this easily like this: values = 1:n. Now you can call for fact = values and you will iterate all the way from 1 to n. However, it is very strange practice to use this intermediary variable values, I was just using it to illustrate what the docs are talking about. The correct standard syntax is
for fact = 1:n
Now, for a factorial (although technically you'll get the same thing), it is clearer to actually loop from n down to 1. So we can do that by declaring a step size of -1:
for fact = n:-1:1
So now we can find the factorial like so:
function output = fac(n)
output = n;
for iter = n-1:-1:2 %// note there is really no need to go to 1 since multiplying by 1 doesn't change the value. Also start at n-1 since we initialized output to be n already
output = output*iter;
end
end
Calling the builtin factorial function inside your own function really defeats the purpose of this exercise. Lastly, I see that you have added a little error check to make sure you don't get negative numbers. That is good; however, the check should not be inside the loop!
function output = fac(n)
if n < 0
error('Input n must be non-negative'); %// I use error rather than disp here as it gives clearer feedback to the user
elseif n == 0
output = 1; %// by definition
else
output = n;
for iter = n-1:-1:2
output = output*iter;
end
end
end
I don't see what you are trying to do with the for loop. I think what you want is:
function fact = fac(n);
if n >= 0
n = floor(n);
fact = factorial(n);
disp(fact)
elseif n < 0
error('Cannot take negative integers') % error instead of disp/return, so fact is never returned unassigned
end
end
Depending on your preference, you can replace floor (round towards minus infinity) with round (round to the nearest integer) or ceil (round towards plus infinity). Some rounding operation is necessary to ensure n is an integer.
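A quick illustration of how the three rounding functions differ, using 4.7 as an arbitrary example value:

```matlab
v = 4.7;                          % arbitrary non-integer example
fprintf('floor: %d, round: %d, ceil: %d\n', floor(v), round(v), ceil(v));
fLow  = factorial(floor(v));      % factorial of 4
fHigh = factorial(ceil(v));       % factorial of 5
fprintf('factorial after floor: %d, after ceil: %d\n', fLow, fHigh);
```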

Break vector into pieces and parallelly applying function on them in Matlab?

I am new to matlab and I do not know how to vectorize the following:
I have a large vector (think 30000 elements) and I want to partition it into pieces of unequal length, specified by row indices into the vector. I have a function that I want to apply to these pieces in parallel (using parfor or otherwise), and then stitch the results back together.
Is there an efficient way to do this? Any pointers will help.
First, run parpool to start a parallel pool in MATLAB (you need the Parallel Computing Toolbox) to get some workers. Then use parfor to run a for loop in parallel, with the iterations distributed across the workers. There are a few rules; for example, an iteration cannot depend on or use results from another iteration.
Consider the following code:
% Run parpool first
n = 100000;
data = (1:n)';
myIndices = 1:5:n;
numSections = length(myIndices) -1;
f = @(x) mean(x);
outputMatrix = zeros(numSections,1);
% Try changing this to parfor or just for and run a few times to see
% the average time:
tic
parfor ind = 1:numSections
if ind == 1
myStart = 1;
else
myStart = myIndices(ind)+1;
end
myEnd = myIndices(ind+1);
outputCell{ind} = f(data(myStart:myEnd));
outputMatrix(ind) = f(data(myStart:myEnd));
end
toc
% convert cell array to matrix
output = cell2mat(outputCell);
Here I show how to collect in a cell or a vector/matrix. It depends on what kind of function you are running on your data. Try changing the parfor to for and running a few times to see the speed difference.
I chose to divide the data into even blocks of size 5 but you could change this to be whatever you want by making myIndices be arbitrary values.
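For pieces of unequal length, one possible variation is to cut the vector into a cell array first, so that each parfor iteration only touches its own piece. This is a sketch with made-up break points; breaks is a hypothetical list holding the last row index of each piece.

```matlab
data = (1:10)';
breaks = [3 5 10];                   % hypothetical: last row index of each piece
lens = diff([0 breaks]);             % piece lengths: 3, 2 and 5
pieces = mat2cell(data, lens, 1);    % cell array with one piece per cell
out = zeros(numel(pieces), 1);       % one result per piece
parfor k = 1:numel(pieces)           % runs serially if the toolbox is not installed
    out(k) = mean(pieces{k});        % replace mean with your own function
end
disp(out')
```

Because pieces and out are both indexed only by the loop variable, parfor can treat them as sliced variables and send each worker only the data it needs.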

Storing Results of a Operation in a Matrix

Let's say I want to take the sin of 1 through 100 (in degrees).
I come from a C background so my instinct is to loop 1 through 100 in a for loop (something I can do in Matlab). In a matrix/vector/array I would store sin(x) where x is the counter of the for loop.
I cannot figure out how to do this in Matlab. Do I create a array like
x = [1 .. 100];
And then do
x[offset] = numberHere;
I know the "correct" way. For element-wise operations like multiplication you use .* instead of *, and with a function like sin I'm pretty sure you just do
resultArray = sin(x);
I just want to know that I could do it the C way in case that ever came up, thus my question here on SO. :)
% vectorized
x = sin((1:100)*pi/180);
or
% nonvectorized
x=[];
for i = 1:100
x(i) = sin(i*pi/180);
end
I believe this can actually be done as a one-liner in MATLAB:
x = sind(1:100);
Note that you use sind() instead of sin(), because sin() takes its argument in radians.
As others have already pointed out there are for-loops in MATLAB as well.
help for
should give you everything you need about how it works. The difference from C is that the loop can go over objects and not only an integer:
objects = struct('Name', {'obj1', 'obj2'}, 'Field1', {'Value1','Value2'});
for x = objects
disp(sprintf('Object %s Field1 = %s', x.Name, x.Field1))
end
That example will output:
Object obj1 Field1 = Value1
Object obj2 Field1 = Value2
This could have been done as
for i=1:length(objects)
x = objects(i);
disp(sprintf('Object %s Field1 = %s', x.Name, x.Field1))
end
And now to what I really wanted to say: if you ever write a for loop in MATLAB, stop and think! For most tasks you can vectorize the code so that it uses matrix operations and builtin functions instead of looping over the data. This usually gives a huge speed gain. It is not uncommon for vectorized code to execute 100x faster than looping code. Recent versions of MATLAB have JIT compilation, which makes the difference less dramatic than before, but still: always vectorize if you can.
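As a small check that the two styles agree, here is a sketch that computes the sine table from the question both ways. The variable names are made up for the example.

```matlab
n = 100;
xLoop = zeros(1, n);              % preallocate the result of the loop version
for i = 1:n
    xLoop(i) = sind(i);           % one element per iteration
end
xVec = sind(1:n);                 % vectorized: one call on the whole range
fprintf('largest difference: %g\n', max(abs(xLoop - xVec)));
```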
@Daniel Fath
I think you'll need the final line to read
resultArray(i) = sin(x(i)) (rather than x(1))
I think you can also do:
for i = x
...
though that will behave differently if x is not a simple 1-100 vector
Hmm, if I understand correctly, you want a loop-like structure:
resultArray = zeros(1,length(x)); %% preallocation is not strictly necessary, since MATLAB grows arrays on assignment
for i = 1:length(x) %% indexing starts at 1 instead of zero
resultArray(i) = sin(x(i));
end
Warning: I didn't test this, but it should be about right.