Running multiple instances of Google OR Solver (CBC) results in no solution found (C++) - or-tools

I am using Google OR-Tools (with the CBC backend) to solve a Mixed Integer Programming problem in C++.
I am following the sample code on the Google OR-Tools website: https://developers.google.com/optimization/mip/integer_opt
The only difference is that I have a number of threads (pthreads) in a thread pool that call the solver (CBC) simultaneously. Each thread has its own thread-local data and calls the solver on that data, so the constraints, the MPSolver instance, etc. are all thread-local (none of them are global).
Problem:
If 'n' threads call the solver simultaneously, the solver reports "No Solution Found" for every thread-local dataset. However, if the whole process runs sequentially, i.e. the datasets are solved one after another by limiting the number of threads to 1, I get a perfect optimal solution for every dataset.
This happens only when a time limit has been set. The time limit is 5 s, set via solver.SetTimeLimit(). I cannot avoid setting a time limit, because some cases may take a huge amount of time to reach the optimal solution. Different threads can have different datasets that vary in size (number of constraints and number of variables).
'n' varies from 2 to 32
Note:
Just to re-clarify: I am invoking 'n' solver calls simultaneously from 'n' threads (each thread calls Solve() on its own solver).
I am not trying to distribute the work of a single solver among 'n' threads.

Related

cplex from MATLAB using parfor

I have a fairly large-scale optimization problem, although the problem itself is fairly simple: a quadratic-plus-linear objective with linear constraints, so it is solvable with cplexqp. Each problem has around 1300 variables, but I need to solve ~200 independent problems.
If I just loop 200 times and call cplexqp as usual, it takes about 16 minutes to solve all the problems. I considered using parallel computing, so I changed the loop to parfor, and it now takes around 14 minutes. I would have expected a much bigger speedup factor, considering that we have 12 cores and 12 workers.
I made sure that the parallel workers were already initialized (so MATLAB does not have to spend time starting them). I also verified in Task Manager that all 12 workers were active, and each was using a non-trivial amount of CPU.
My question is: do you think cplexqp has a locking mechanism, i.e. it cannot be called on more than one problem at a given time (from different threads)? What if I use different MATLAB processes? (For example, I could save the inputs to a file and start up several MATLAB sessions to consume the file, with each session knowing which subset of the problems to solve.)
16 minutes is not bad, but we may need to do this several times a day (with potentially different inputs), so I was wondering if we can speed up the process even more.
TIA
The problem is that by default CPLEX will use all cores on your machine to solve one problem. So if you attempt to solve multiple problems in parallel, you are heavily oversubscribing the CPUs, which is likely to result in an overall slowdown.
So you should carefully select how many models you solve in parallel and how many cores you allow for each solve. If you use parfor, then you should use the Cplex.Param.threads parameter to limit the number of cores for a single solve, or alternatively select the simplex algorithm to solve your QPs.
Whether this whole parallelization gives you an overall speedup depends on how much slowdown you will observe for the individual models by limiting the thread counts.
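For example, here is a minimal sketch of a one-thread-per-solve setup. It assumes the options structure returned by cplexoptimset('cplex') exposes the CPLEX threads parameter (the function-API counterpart of Cplex.Param.threads), and H, f, Aineq, bineq are placeholders for your per-problem data:

opt = cplexoptimset('cplex');
opt.threads = 1;                     % cap each cplexqp call at one core

nProblems = 200;
x = cell(1, nProblems);
parfor k = 1:nProblems
    % 12 workers x 1 thread each = no oversubscription on 12 cores
    x{k} = cplexqp(H{k}, f{k}, Aineq{k}, bineq{k}, ...
                   [], [], [], [], [], opt);
end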

Number of workers in Matlab's parfor

I am running a for loop using MATLAB's parfor function. My CPU is an Intel i7-9750H (6 physical cores, 12 logical processors).
I set the preferred number of workers to 24; however, MATLAB caps this number at 6. Is the number of workers bounded by the number of physical cores (6), or by (number of cores) x (threads per core) = 6 x 2 = 12?
Matlab prefers to limit the number of workers to the number of cores (six in your case).
Your CPU (Intel i7-9750H) has hyper-threading, i.e. it can run multiple (here, 2) threads per core. However, this is of no use if you want to run the cores under full load: there are simply no resources left to switch to a different task (which is effectively what the additional threads are).
See the documentation.
Restricting to one worker per physical core ensures that each worker has exclusive access to a floating point unit, which generally optimizes performance of computational code. If your code is not computationally intensive, for example, it is input/output (I/O) intensive, then consider using up to two workers per physical core. Running too many workers on too few resources may impact performance and stability of your machine.
Note that MATLAB needs to stream data to every worker in order to run the distributed code. This is a kind of initialization effort, and it is the reason why you won't be able to cut the runtime in half by doubling the number of cores/workers. It also explains why there is no point in MATLAB using hyper-threading: it would just increase the initial streaming effort without any speed-up. In fact, the core would probably force MATLAB to save intermediate results and switch to the other task from time to time... which is the same task as before. ;)
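If you still want to inspect or override the default pool size, here is a minimal sketch, assuming the default 'local' cluster profile:

c = parcluster('local');   % the default local cluster profile
disp(c.NumWorkers)         % defaults to the number of physical cores (6 here)
p = parpool(c, 6);         % start a pool sized to the physical cores

parpool(c, 12) would give you one worker per hyper-thread, but as explained above this rarely helps for compute-bound code.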

Parallel computing data extraction from a SQL database in Matlab

In my current setup I have a for loop in which I extract different types of data from a SQL database hosted on Amazon EC2. The extraction is done in the function extractData(variableName). After that, the data gets parsed and stored as a .mat file by parsestoreData(data):
variables = {'A','B','C','D','E'};
for i = 1:length(variables)
    data = extractData(variables{i});
    parsestoreData(data);
end
I would like to parallelize this extraction and parsing of the data to speed up the process. I believe I could do this by using parfor instead of for in the example above, as in the sketch below.
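For reference, a minimal parfor version of the loop above, assuming extractData and parsestoreData are visible to the pool workers:

parfor i = 1:length(variables)
    data = extractData(variables{i});   % each worker issues its own SQL query
    parsestoreData(data);               % and writes its own .mat file
end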
However, I am worried that the extraction will not improve, as the SQL database will slow down when multiple simultaneous requests are made to it.
I am therefore wondering whether MATLAB can handle this issue in a smart way, in terms of parallelization.
The workers in a parallel pool running parfor are basically full MATLAB processes without a UI, and they default to running in "single computational thread" mode. I'm not sure whether parfor will benefit you in this case: the parfor loop simply arranges for the MATLAB workers to execute the iterations of your loop in parallel. You can estimate for yourself how well your problem will parallelise by launching multiple full desktop MATLAB sessions and setting them off on your problem simultaneously. I would run something like this:
maxNumCompThreads(1);   % match the workers' single-threaded mode
while true
    t = tic();
    data = extractData(...);
    parsestoreData(data);
    toc(t)
end
and then check how the times reported by toc vary as the number of MATLAB clients varies. If the times remain constant, you could reasonably expect parfor to give you a benefit (because it means the loop body can be parallelised effectively). If, however, the times increase significantly as you run more MATLAB clients, then it's almost certain that parfor would experience the same (relative) slowdown.

Matlab Segmentation Violation and Memory Assertion Failure

I am running multiple Matlab jobs in parallel on a Sun Grid Engine cluster that uses Matlab 2016b. On my personal MacBook I am running Matlab 2016a. The script does some MRI image processing, where each job uses a different set of parameters so that I can do parameter optimization for my image-processing routine.
About half of the jobs crash, however: either due to segmentation violations, malloc.c memory assertion failures ('You may have modified memory not owned by you.'), or errors from HDF5-DIAG followed by a segmentation violation.
Some observations:
The errors do not always occur in the same jobs or in the same functions, but the crashes occur in several groups of jobs, where the jobs within one group crash within one minute of one another.
I am no longer using dynamic arrays but preallocate my arrays. If an array turns out to be too small, I extend it with, for example, cat(2, array, zeros(1, 2000)).
The jobs partly use the same computations, so they can share data. I do this by first checking whether the data has already been generated by another job. If so, I try to load it using a while loop with a maximum number of attempts and 1-second pauses (loading might fail while another job is still writing to the file; if this job waits a bit and retries, it might succeed). If the loading fails after the maximum number of attempts, or if the data does not exist yet, then this job performs the required computations and tries to save the data. If the data was saved by another job in the meantime, then this job does not save it again (see the sketch after these observations).
I am not using any C/C++ or MEX files.
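A minimal sketch of the load-with-retry scheme described above; cacheFile, maxAttempts, and computeData are placeholder names, not the poster's actual code:

maxAttempts = 10;
data = [];
if exist(cacheFile, 'file')
    for attempt = 1:maxAttempts
        try
            s = load(cacheFile, 'data');   % can fail while another job is writing
            data = s.data;
            break
        catch
            pause(1)                       % wait a second, then retry
        end
    end
end
if isempty(data)
    data = computeData();                  % fall back to computing locally
    if ~exist(cacheFile, 'file')           % note: exist-then-save is not atomic,
        save(cacheFile, 'data');           % so two jobs can still race here
    end
end

The non-atomic exist/save check is one plausible hazard in such a scheme, since a reader can observe a partially written file.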
I have tested a subset of the jobs on my own laptop with Matlab 2016a and on a Linux computer with Matlab 2016b; those worked fine. But again, the problem occurs only after a few hundred of the total 500 iterations, and due to time constraints I did not run the full simulation on my own computer, only around 20 iterations.

Faster way to run a Simulink simulation repeatedly for a large number of times

I want to run a simulation which includes SimEvents blocks (so only Normal mode is available for the run) a large number of times, at least 1000. When I use sim, it compiles the model every time, and I wonder if there is another solution which just runs the simulation repeatedly in a faster way. I disabled the Rebuild option in the Configuration Parameters, and it does make things faster, but it still takes ages to run around 100 times.
A single simulation run is not long at all.
Thank you!
It's difficult to say why the model compiles every time without actually seeing the model and what's inside it. However, the Parallel Computing Toolbox gives you the ability to distribute the iterations of your model across several cores, or even several machines (with the MATLAB Distributed Computing Server). See "Run Parallel Simulations" in the documentation for more details.
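If you are on R2017a or later, parsim is the documented entry point for this. A minimal sketch, where the model name 'myModel' and the swept variable 'seed' are placeholders:

model = 'myModel';
numRuns = 1000;
in(1:numRuns) = Simulink.SimulationInput(model);   % preallocate the input array
for k = 1:numRuns
    in(k) = Simulink.SimulationInput(model);
    in(k) = in(k).setVariable('seed', k);          % per-run workspace variable
end
out = parsim(in, 'UseFastRestart', 'on');          % compile once per worker, then reuse

'UseFastRestart' avoids recompiling between runs on each worker, though whether fast restart is supported alongside SimEvents blocks is something to verify for your particular model.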