Julia multicore code has strange errors

I wrote the following code, and it makes sense to me, but it outputs a bunch of random stuff and also runs slower than a single-core implementation. The code works like this: it creates a vector of remote references with size equal to the number of cores on the machine, then divides the integral into that many segments, assigns a remote reference to each segment so that each one is processed on a different core, and finally combines the partial results in the summation loop. Without further ado, here is the offending code.
function multiquadgk(F,xMin,xMax)
    procs = nworkers()
    results = Array(RemoteRef, 1, procs)
    bnds = linspace(xMin, xMax, procs+1)
    for i = 1:procs
        #println("Starting Process")
        #println(i)
        results[i] = @spawn quadgk(F, bnds[i], bnds[i+1])[1]
    end
    sum1 = 0
    for r in results
        #println("Results for Each integration step")
        #println(fetch(r))
        sum1 += fetch(r)
    end
    return sum1
end
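For comparison, the same split-integrate-sum pattern can be sketched in plain Python. This is a hypothetical, stdlib-only sketch (`midpoint_quad` and `multi_quad` are made-up names, and a crude fixed-step midpoint rule stands in for quadgk's adaptive quadrature):

```python
# Hypothetical sketch: split [x_min, x_max] into one segment per worker,
# integrate each segment independently, then sum the partial results.
from concurrent.futures import ThreadPoolExecutor

def midpoint_quad(f, a, b, n=10_000):
    """Crude fixed-step midpoint rule (stand-in for quadgk)."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

def multi_quad(f, x_min, x_max, workers=4):
    # One segment per worker, like linspace(xMin, xMax, procs+1) in the question
    bnds = [x_min + i * (x_max - x_min) / workers for i in range(workers + 1)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(midpoint_quad, f, bnds[i], bnds[i + 1])
                   for i in range(workers)]
        # Analogous to the fetch-and-sum loop over the remote references
        return sum(fut.result() for fut in futures)
```

The key point the sketch shares with the Julia code is that each segment is an independent task whose result is fetched and summed at the end.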
For some reason it includes random output such as:
Loading help data...
From worker 2: Loading help data...
From worker 3: Loading help data...
From worker 3: 0196.290571870170
From worker 3: 98.145285935085
From worker 3: 196.29057187017
From worker 2: 0196.290571870170
From worker 2: 98.145285935085
From worker 2: 196.29057187017
From worker 3: -.0005038237605425529
From worker 3: .3255338699535369
From worker 3: 324.8745134499407
From worker 3:
What does this output mean, and how do I silence it once my code works?
Also, the code seems to just be executing the exact same thing twice on both cores... I can't see any reason why that would be happening.
Also any tips on methods for multicore techniques would be awesome.
Thanks

Related

Abort execution of parsim

For the use case of being able to abort parallel simulations from a MATLAB GUI, I would like to stop all scheduled simulations after the user presses the Stop button.
All simulations are submitted at once using the parsim command, hence something like a callback to my GUI variables (App Designer) would be the most preferable solution.
Approaches I have considered but were not exactly providing a desirable solution:
The Simulation Manager provides the functionality to close simulations using its own interface. If I only had the code its Stop button executes...
parsim uses the Simulink.SimulationInput class as input to run simulations, which allows modifying the preSimFcn at the beginning of each simulation. So far I have not found a way to "skip" the simulation in its initialization phase apart from intentionally throwing an error.
Thank you for your help!
Update 1: Using the preSimFcn to set the termination time equal to the start time drastically reduces simulation time. But since the first step is still computed, there has to be a better solution.
simin = simin.setModelParameter('StopTime',get_param(mdl,'StartTime'))
Update 2: Intentionally throwing an error executing the preSimFcn, for example by setting it to
simin = simin.setModelParameter('SimulationCommand','stop')
provides the shortest termination times for me so far. However, it requires catching and identifying the error in the ErrorMessage of the Simulink.SimulationOutput object. As this is exactly the "ugly" implementation I wanted to avoid, the issue is still open.
If you are using 17b or later, parsim provides an option to 'RunInBackground'. It returns an array of Future objects.
F = parsim(in, 'RunInBackground', 'on')
Please note that this is only available for parallel simulations. The Simulink.Simulation.Future object F provides a cancel method which will terminate the simulation. You can use the fetchOutputs method to fetch the output from the simulation.
F.cancel();
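The cancel-what-hasn't-started pattern is not MATLAB-specific. As a hypothetical analogue, Python's concurrent.futures exposes the same Future-plus-cancel shape (all names here are illustrative, not part of parsim):

```python
# Hypothetical Python analogue of RunInBackground + cancel: submit the runs
# as futures, then cancel everything still queued when Stop is pressed.
from concurrent.futures import ThreadPoolExecutor
import time

with ThreadPoolExecutor(max_workers=1) as pool:
    # Queue ten "simulations"; with one worker, most sit waiting to start.
    futures = [pool.submit(time.sleep, 0.05) for _ in range(10)]
    # cancel() returns True for queued futures and False for running ones;
    # cancelled futures are never executed.
    cancelled = [f.cancel() for f in futures]
```

As with Simulink's Future array, cancelling only prevents work that has not yet begun; an already-running item must finish (or be stopped by other means).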

Celery setup and teardown tasks

I am trying to use celery to parallelise the evaluation of a function with different parameters.
Here is pseudo-code of what I am trying to achieve, which assumes that there is a function called evaluate decorated with @app.task
# 0. Setup cluster, celery or whatever parallelisation backend
pass

# 1. Prepare each node to simulate, this means sending some files
for node in mycluster:
    # send files to node
    pass

# 2. Evaluation phase
gen = Generator()  # A Generator object creates parameter vectors that need to be evaluated
while not gen.finished():
    par_list = gen.generate()
    asyncs = []
    for p in par_list:
        asyncs.append(evaluate.delay(p))
    results = [-1 for _ in par_list]
    for i, pending in enumerate(asyncs):
        if not pending.ready():
            pending.wait()
        if pending.successful():
            results[i] = pending.get()
        else:
            pass  # manage error
    # send results to generator so that it generates a new set of parameters later
    gen.tell(results)

# 3. Teardown phase
for node in mycluster:
    # tell node to delete files
    pass
The problem with this approach is that if my main application is already past the setup phase when a new node connects, that node will never run the setup phase. Similarly, the teardown phase will not be executed if a node disconnects.
A couple of solutions come to mind:
Instead of using a setup phase, chain two functions so that each node does setup | evaluate | teardown for each iteration of the "2. evaluation phase" loop. The problem here is that sending files through the message queue is something that I would like to avoid as much as possible.
Configure the workers to have setup and teardown tasks so that they are automatically ready when they connect. I tried using bootsteps.StartStopStep, but I am not sure if this is the right way to go.
Setup a distributed file system so that there is no need to prepare and delete files before and after the evaluations
The concrete question here is: what is the recommended approach for this kind of task? I am sure this is not a convoluted use case, and maybe one of you can provide some guidance on how I should approach this.
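The first option, bundling setup | evaluate | teardown into one unit of work, can be sketched without Celery at all. This is a hypothetical pure-Python illustration (`node_files` and `evaluate_with_setup` are made-up names, and simple stand-ins replace the real file transfer and evaluation):

```python
# Hypothetical sketch of option 1: wrap each evaluation in its own
# setup/teardown so a worker that joins late is still prepared correctly.
from contextlib import contextmanager

@contextmanager
def node_files(params):
    staged = ["file-%s" % params]   # stand-in for sending files to the node
    try:
        yield staged                # evaluation runs while files are staged
    finally:
        staged.clear()              # stand-in for telling the node to delete files

def evaluate_with_setup(params):
    # Setup and teardown always run, even if evaluation raises.
    with node_files(params):
        return params * 2           # stand-in for the real evaluate task
```

The try/finally inside the context manager is what guarantees teardown happens per task, which is exactly what a connect-time-only setup phase cannot guarantee; the cost is that setup work is repeated on every call.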
I'm not sure this is a worker issue - remember you may have a host of workers on a node. This sounds more like a node initialization issue. Why not have a job (a systemd task, an init script, whatever) that runs before the celery workers and which copies the files over. Similarly, in reverse, for tear down.

Force parfor to respect some order

I understand that some indeterminism stems from parfor's parallel nature but I don't understand why it should be entirely random. Is there any way to force parfor to respect (at least loosely) the order of the loop? More specifically I would like that in the case of:
parfor i=1:100
do_independent_stuff()
end
each worker of the pool, when asking for a new task (i.e. in this case a new iteration of the loop), to be assigned the lowest i that hasn't been computed or assigned to a worker yet.
I think it's by design that running something in parallel assumes that order is not important, at least in MATLAB. Each thread/worker should be independent of the others. However, as indicated in this question, you could try the job and task control interface to get some level of control.
Firstly, in practice, PARFOR isn't "entirely random" - you can easily observe that it sends out chunks of loop iterates in reverse order. In R2013b and later, if you need more control over ordering (if, for example, you know that certain of your independent things are likely to take a long time, and therefore wish to start computing them first), you can use PARFEVAL.
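The PARFEVAL idea of controlling start order by submitting tasks explicitly has a direct analogue in Python's futures (a hypothetical sketch; `run_in_order` is a made-up name):

```python
# Hypothetical analogue of the PARFEVAL approach: submit work items
# explicitly in ascending order, so the lowest pending i is started first.
from concurrent.futures import ThreadPoolExecutor

def run_in_order(task, n, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Tasks are queued FIFO, so an idle worker always picks up
        # the lowest i that has not yet been started.
        futures = [pool.submit(task, i) for i in range(1, n + 1)]
        # Collect results in loop order regardless of completion order.
        return [f.result() for f in futures]
```

Note this only controls *start* order; completion order still depends on how long each task runs, which is why collecting via the ordered futures list matters.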
If you need to loosely synchronize things, for instance wait until some thread has finished or has reached some point before starting another one, the best approach is to use semaphores, locks, mutexes, etc.
I don't know if the Parallel Computing Toolbox includes such synchronization objects, but here is a workaround to create a semaphore, for instance:
https://stackoverflow.com/a/22874669/684399
You can also use objects in 'System.Threading' namespace (requires .NET):
Init:
someResultAvailable = System.Threading.ManualResetEvent(false);
In some job:
... do work ...
someResultAvailable.Set();
... continue ...
In another one:
... do work ...
if ~someResultAvailable.WaitOne(10000)
    error('Timeout waiting for result from other thread');
end
... continue now knowing that results are available ...

Implementing a priority queue in matlab in order to solve optimization problems using BRANCH AND BOUND

I'm trying to code a priority queue in MATLAB. I know there is a Simulink toolbox block for priority queues, but I'm trying to code it in MATLAB itself. I have pseudo-code that uses a priority queue for a method called Best-First Search with Branch and Bound. Branch and bound is an algorithm design strategy based on a state space tree, and it is used to solve optimization problems. simple explanation of what is branch and bound
I have read Chapter 5: Branch and Bound from a book called 'Foundations of Algorithms' (4th edition, by Richard Neapolitan and Kumarss Naimipour); the text is about designing algorithms, complexity analysis of algorithms, and computational complexity (analysis of problems). Very interesting book, and I came across this pseudocode:
void BeFS(state_space_tree T, number& best)
{
    priority_queue_of_node PQ;
    node u, v;
    initialize(PQ);                       % initialize PQ to be empty
    v = root of T;
    best = value(v);
    insert(PQ, v);                        % insert(PQ,v) adds v to the priority queue PQ
    while (!empty(PQ)) {                  % remove node with best bound
        remove(PQ, v);                    % remove(PQ,v) removes the node with the best bound and assigns it to v
        if (bound(v) is better than best) % check if node is still promising
            for (each child u of v) {
                if (value(u) is better than best)
                    best = value(u);
                if (bound(u) is better than best)
                    insert(PQ, u);
            }
    }
}
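Although the question asks for MATLAB, the shape of the pseudocode is easy to see in an executable form. Here is a hypothetical Python sketch using heapq as the priority queue (for a maximization problem; heapq is a min-heap, so bounds are negated to pop the best bound first, and `best_first_search` is a made-up name):

```python
# Hypothetical sketch of the pseudocode above using heapq as the priority queue.
import heapq

def best_first_search(root, children, value, bound):
    """Best-First Search with Branch and Bound for a maximization problem."""
    best = value(root)                     # best = value(v)
    pq = [(-bound(root), root)]            # insert(PQ, v); negate for min-heap
    while pq:                              # while (!empty(PQ))
        neg_bound, v = heapq.heappop(pq)   # remove node with best bound
        if -neg_bound > best:              # check if node is still promising
            for u in children(v):          # for each child u of v
                if value(u) > best:
                    best = value(u)
                if bound(u) > best:
                    heapq.heappush(pq, (-bound(u), u))
    return best
```

The pruning test on the popped bound is what distinguishes this from plain best-first search: once a node's bound is no better than the incumbent, its whole subtree is skipped.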
I don't know how to code it in MATLAB. Branch and bound is an interesting general algorithm for finding optimal solutions to various optimization problems, especially in discrete and combinatorial optimization: instead of relying on heuristics, it finds the optimal solution, and its pruning means it often does so faster than exhaustive search.
EDIT:
I have checked everywhere whether a solution has already been implemented before posting a question here. And I came here to get ideas of how I can get started implementing this code.
I have included this in your post so people can know better what you expect of them. However, 'ideas to get started to implement' is still not much more specific than 'how to write code in matlab'.
However, I will still try to answer:
Make the structure of the code: write the basic loops and fill them with comments describing what you want to do
Pick (the easiest or first) one of those comments and see whether you can make it happen in a few lines; you can test it by generating some dummy input for that piece of code
Keep repeating step 2 until all comments have the required code
If you get stuck in one of the blocks and have searched but not found the answer to a specific question, then this is not a bad place to ask.

Silence or condense IPython parallel exceptions

Is it possible to silence the details of a composite exception containing the errors from IPython parallel workers? I have a large cluster (500+ workers), and if my (bad) code throws an exception on all workers, it takes forever for the exception to parse and render in the IPython Notebook. I'd like to silence the details of the worker errors and get one simple, tiny exception back with the details from a single worker, since the rest tend to be the same in my usage.
I know I can switch my DirectView to point at one worker to test my code, but it'd be handy not to manipulate the dview and instead just set a global flag to avoid giant stack traces.
Step 1: ask this question
Step 2: checkout this Pull Request
If you just want to see the first exception, you can register a custom exception handler that does exactly that:
from IPython.parallel import error

def only_the_first(self, etype, value, tb, tb_offset=None):
    value.print_traceback(0)

ip = get_ipython()
ip.set_custom_exc((error.CompositeError,), only_the_first)