Maximum number of decision variables in scipy linear programming module in python - scipy

Is there a maximum number of decision variables in the scipy linear programming module (minimization) in Python? If so, can the number of decision variables be extended to 10000? If scipy limits the number of decision variables, is there other software that can be installed for Python so that I can proceed?

The original scipy Simplex LP solver was only suitable for very small problems. The newer scipy Interior Point solver can handle larger problems more reliably. Also make sure to pass A_eq and/or A_ub as sparse matrices; if you don't, you may run out of memory.
Having said this, I would be more comfortable with LP solvers that have seen more large, sparse problems than scipy. Most LP solvers have a Python interface.
Finally, larger problems are often (but not always) more complex and it may help to use a modeling tool. This will allow you to express the problem in a more natural way than using matrices. For Python there is PuLP and Pyomo (among others). Some commercial solvers also provide excellent modeling tools.
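To illustrate the sparse-matrix advice above, here is a minimal sketch with 10000 decision variables. It assumes a SciPy version recent enough to offer method="highs" in linprog; the toy problem itself (consecutive pairs of variables must sum to at least 1) is made up purely for illustration.

```python
import numpy as np
from scipy import sparse
from scipy.optimize import linprog

n = 10_000  # number of decision variables

# Toy problem (an assumption for illustration):
# minimize sum(x) subject to x[i] + x[i+1] >= 1 and 0 <= x <= 1.
c = np.ones(n)

# linprog expects A_ub @ x <= b_ub, so x[i] + x[i+1] >= 1
# is written as -x[i] - x[i+1] <= -1.
rows = np.repeat(np.arange(n - 1), 2)
cols = np.column_stack([np.arange(n - 1), np.arange(1, n)]).ravel()
vals = -np.ones(2 * (n - 1))

# Only the 2 * (n - 1) nonzeros are stored, not all (n - 1) * n entries.
A_ub = sparse.csr_matrix((vals, (rows, cols)), shape=(n - 1, n))
b_ub = -np.ones(n - 1)

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, 1), method="highs")
print(res.status, res.fun)
```

A dense float64 version of this constraint matrix would need roughly 800 MB, while the sparse one stores fewer than 20000 nonzeros, which is why the matrix format matters more than the variable count.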

Related

what does impact a simulation runtime in Modelica

In order to make my model simulations in Modelica run faster, I am asking the following question:
What impacts simulation runtime in Modelica?
I will appreciate any help.
Edit: More details can be consulted from my book "Modelica by Application -- Power Systems" (URL)
What does impact the runtime performance?
I. Applied compilation techniques
Naturally, object-oriented Modelica models, even trivial ones, would correspond to a large-scale system of equations. Modelica simulation environments would usually optimize such generated models:
reduce the number of equations by removing trivial ones (i.e. alias equations),
decompose a large block of equations, using the so-called BLT transformation, into smaller cascaded blocks that can be solved faster in sequence rather than as a single block of equations,
solve so-called large algebraic loops using tearing methods.
In theory it can go even further and attempt to solve blocks of the equation system analytically, where possible, instead of conducting expensive numerical integration.
Thus, the runtime performance is influenced by the underlying Modelica compiler and by how far it exploits equation-based compiler methods. Usually some extra settings need to be activated to exploit all such techniques, so it is worth digging through the documentation to find out how to enable them.
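The BLT step mentioned above can be sketched as finding strongly connected components of the dependency graph between unknowns. The following is only an illustrative sketch, not how any particular Modelica compiler implements it: the input dictionary and the function name blt_blocks are hypothetical, and a real tool would first derive the dependencies from a matching between equations and variables.

```python
def blt_blocks(depends_on):
    """Group unknowns into blocks (strongly connected components).

    depends_on maps each unknown to the other unknowns appearing in the
    equation assigned to it (hypothetical input format). Blocks are
    returned in an order in which they can be solved one after another.
    """
    index, lowlink = {}, {}
    stack, on_stack = [], set()
    blocks = []
    counter = [0]

    def visit(v):
        # Tarjan's algorithm: depth-first search with low-link values.
        index[v] = lowlink[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in depends_on.get(v, ()):
            if w not in index:
                visit(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index[w])
        if lowlink[v] == index[v]:
            # v is the root of a component: pop the whole block.
            block = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                block.append(w)
                if w == v:
                    break
            blocks.append(sorted(block))

    for v in depends_on:
        if v not in index:
            visit(v)
    return blocks


# Hypothetical toy system: x and y form an algebraic loop.
system = {
    "a": [],          # a = 2
    "b": ["a"],       # b = a + 1
    "x": ["y", "b"],  # x + y = b
    "y": ["x"],       # x - y = 1
    "z": ["x", "y"],  # z = x * y
}
blocks = blt_blocks(system)
print(blocks)
```

Because a component is only emitted after everything it depends on, the blocks come out in solving order: single-variable blocks can be evaluated directly, and only the x/y block needs a simultaneous solve.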
II. The nature of the model
The nature of the model would influence the runtime performance, particularly:
Is the model a large-scale system? or a small-scale one?
Is it strongly nonlinear or semi-linear one?
Is the resulting optimized equation system corresponding to the model sparse (i.e. a large set of equations, each with a small number of variables, e.g. power system network models) or dense (e.g. multibody systems and biochemical networks)?
Is it a stiff system? (e.g. a system with several subsystems some exhibiting very quick dynamics and others very slow dynamics)
Does the system exhibit a large number of state events?
...
III. The choice of the solver
The mentioned characteristics of a given model would typically influence the ideal choice of the solver. The solver can largely influence the runtime performance (and accuracy). A strategy for solver choice could be made in the following order:
For a non-stiff, weakly nonlinear model, the ideal choice would be an explicit method, e.g. a single-step Runge-Kutta method or a multi-step Adams-Bashforth method of higher order. If accuracy is less significant, one can try an explicit method of lower order, which would execute faster. Naturally, increasing the solver error tolerance would also speed up the simulation.
However, it could happen, particularly for large-scale systems, that numerical stability could be more difficult to guarantee. Then, smaller solver step-sizes (and/or smaller error tolerance) for explicit solvers should be attempted. In this case, an implicit solver with larger error tolerance can be comparable with an explicit solver with a smaller tolerance.
Actually, it is wise to try both methods, comparing the accuracy of the results, and figuring out if explicit methods produce comparably accurate results. However, as a warning this would be just a heuristic, since the system does not necessarily have the same behavior over the entire space of admissible parameter values.
With increasing nonlinearity of the model, the choice would tend more towards modern solvers making use of variable step-size techniques. Here I would start with implicit variable-step Runge-Kutta (i.e. single-step) methods and/or the implicit variable-step multi-step Adams–Moulton methods. For both of these classes, one can enlarge the solver tolerance and/or lower the solver order and check whether the simulation still produces comparably accurate solutions (but with faster runtime).
Implementations of the previous classes of methods are usually less conservative with error control. Therefore, with increasing stiffness of the model or for badly scaled models, the choice would tend more towards modern solvers implementing the numerically more stable backward differentiation formulas (BDF), such as DASSL, CVODE and IDA. These solvers can also make use of the Jacobian of the system for adaptive step-size control.
A modern solver like LSODAR, which switches between explicit and implicit methods and also performs automatic order control (switching between different orders), is a good choice if one does not know much about the behavior of the model. Some Modelica environments may have an advanced solver that makes use of automatic switching. However, if one knows the behavior of the model in advance, it is also wise to try the other suggested methods, since LSODAR may not always switch optimally when needed.
x. ...
The comparisons between solvers from classes 3,4 and 5 are not straightforward to judge and it depends also on whether the system is continuous or hybrid, i.e. the underlying root-finding algorithms.
Usually DASSL could be slower as it is more conservative with step-size/error control. So it seems that IDA and others are faster. Some published works exist that can give some intuitions regarding such comparisons. It would be nice to have a Modelica library including all possible types of models and running all possible benchmarks w.r.t. accuracy and runtime to draw some more solver/model specific conclusions. A library that could be used and extended for such a purpose is the ScalableTestSuite Modelica library.
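The trade-off between explicit and implicit solvers on stiff systems, discussed in the solver list above, can be seen on the scalar test equation y' = -lam * y. This is a minimal sketch using the simplest explicit and implicit Euler schemes, not the production solvers named above; lam, h and the step counts are arbitrary values chosen for illustration.

```python
def explicit_euler(lam, h, steps, y0=1.0):
    """Forward Euler for y' = -lam * y."""
    y = y0
    for _ in range(steps):
        y = y + h * (-lam * y)
    return y


def implicit_euler(lam, h, steps, y0=1.0):
    """Backward Euler for y' = -lam * y.

    Each step solves y_new = y + h * (-lam * y_new) for y_new,
    which is possible in closed form for this linear equation.
    """
    y = y0
    for _ in range(steps):
        y = y / (1.0 + h * lam)
    return y


lam = 1000.0  # fast decay rate: the exact solution vanishes almost immediately

# Same large step size for both methods.
y_explicit_large = explicit_euler(lam, h=0.01, steps=20)
y_implicit_large = implicit_euler(lam, h=0.01, steps=20)

# The explicit method needs a much smaller step to remain stable.
y_explicit_small = explicit_euler(lam, h=0.0005, steps=400)

print(y_explicit_large, y_implicit_large, y_explicit_small)
```

With h = 0.01 the explicit iterate is multiplied by roughly -9 per step and blows up, while the implicit iterate decays as the exact solution does. The explicit method only behaves once the step is about twenty times smaller, which is the extra cost referred to in the discussion of step sizes above.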
IV. Advanced aspects
There have been some published works in the Modelica community regarding making use of sparse solvers to exploit the expected sparsity of the Jacobian. If such a feature is provided by the simulation environment, this would usually significantly improve the runtime performance of large-scale models.
For models with a massive number of events, numerical integration in the standard way can be extremely inefficient. It is particularly challenging when a triggered event triggers further sets of state events, so that a queue of state events has to be evaluated. The root-finding algorithm could trigger yet more events, and the solver could get stuck in a so-called chattering situation. There are advanced strategies for such situations, e.g. the so-called sliding mode; however, I am not sure how far Modelica simulation environments handle this issue.
One set of suggested solutions (also for systems with a high degree of stiffness) is to employ so-called QSS (quantized state system) methods. This would be especially beneficial for models that cannot be solved using explicit solvers. There are both explicit and implicit QSS methods. There are also other numerical integration strategies worth trying, in which only subsets of the entire equation system are evaluated when approximating a state event. Here I am not sure about the availability of such solvers.
Some simulation environments differentiate between two simulation modes which can influence the simulation runtime: the ODE mode and the DAE mode. In the ODE mode, the system is reduced to an ODE system with potentially additional cascaded blocks of nonlinear equation systems. In the DAE mode, the system is reduced to a DAE system of index one. The ODE mode would be beneficial for dense systems exhibiting such large cascaded blocks of nonlinear equations, which are solved using so-called tearing methods instead of numerical integration. The DAE mode would be beneficial for large-scale sparse systems solved using sparse solvers. I think the ODE mode is usually activated by choosing CVODE or LSODAR, while the DAE mode is activated by choosing IDA or DASSL, but checking the documentation is also recommended.
There are also some published works regarding so called multirate numerical integration solvers. Here, in each numerical integration step, only the numerically-significant portion of the equation system and not the entire equation system is integrated. Hence, this is significantly beneficial for large-scale stiff systems.
x. ...
V. Parallelization
Obviously, making use of multicore / GPUs for executing numerical integration in parallel, among other approaches for applying parallelization can speed-up computations.
VI. Very advanced topics
To draw attention to some excellent research efforts, some of which can be exploited to speed up the simulation of large-scale (loosely coupled) hybrid networked models, I am listing them here as well. Speed-up can be obtained by making use of hybrid paradigms, the agent-based modeling paradigm and/or the multimode paradigm. The underlying idea is that a loosely coupled system can be described as several smaller subsystems that communicate with each other only when necessary. The reasons why this is beneficial can be traced by searching for relevant publications. There has been some excellent work in some of these directions, and it is worth continuing it where it has stopped, if that is the case.
Remark: Not every solver mentioned here is available in every Modelica simulation environment. If a solver is not offered as a choice, one can still generate an FMU-ME (Functional Mock-up Unit for Model Exchange) and write code that numerically integrates this FMU with the desired solver.
Warning: Some of the above aspects are based on personal experiences for a particular type of models and are not necessarily true for all model types.
A few suggested readings (I am certainly missing many key publications):
F. Casella, Simulation of Large-Scale Models in Modelica: State of the Art and Future Perspectives, Modelica 2016
Liu Liu, Felix Felgner and Georg Frey, Comparison of 4 numerical solvers for stiff and hybrid systems simulation, Conference 2010
Willi Braun, Francesco Casella and Bernhard Bachmann, Solving large-scale Modelica models: new approaches and experimental results using OpenModelica, Modelica 2017
Erik Henningsson and Hans Olsson and Luigi Vanfretti, DAE Solvers for Large-Scale Hybrid Models, Modelica 2019
Tamara Beltrame and François Cellier, Quantised state system simulation in Dymola/Modelica using the DEVS formalism, Modelica 2006
Victorino Sanz and Federico Bergero and Alfonso Urquia, An approach to agent-based modeling with Modelica, Simpra 2010

Speed of CPLEX vs CPLEX using SCIP

I am new to LP and have only briefly used PuLP in Python.
Why is there a speed difference between "SCIP 3.2.1 - CPLEX 12.6.3" and CPLEX 12.6.3? Doesn't SCIP still use CPLEX for solving?
Why would someone use SCIP with the CPLEX solver instead of using CPLEX directly?
What is this difference about
This plot does not show an LP benchmark, but a mixed-integer programming benchmark.
Mixed-integer programming solvers typically use a branch-and-cut-based algorithm (including heuristics and co.), in which many relaxations are solved in sequence; treating the binary/integer variables as continuous turns each relaxation into an LP problem.
One decision then is how to solve these relaxed subproblems. The simplest decision (there are many more, e.g. tuning the Simplex algorithm's parameters, and it gets even more complex when solving problems with nonlinear/conic objectives) is the choice of LP solver.
SoPlex is an LP solver implementation by the SCIP team. Meaning:
SCIP - SoPlex will use SCIP's algorithm for MIP (handling branching, cut-generation and co.) using SoPlex as solver for the internal LP-subproblems
SCIP - CPLEX will use SCIP's algorithm for MIP using CPLEX as solver for the internal LP-subproblems
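The split between the MIP algorithm and the relaxation solver described above can be sketched on a 0/1 knapsack problem. This is a deliberately tiny illustration and not how SCIP or CPLEX work internally: there are no cuts or heuristics, the function names are hypothetical, and the relaxation is solved by the greedy value/weight rule (exact for a single knapsack constraint) instead of a general LP solver such as SoPlex or CPLEX.

```python
def lp_relaxation(values, weights, capacity, fixed):
    """Relaxation of a 0/1 knapsack where variables may take values in [0, 1].

    fixed maps variable indices to 0 or 1 (branching decisions).
    Returns (bound, fractional_index); fractional_index is None when the
    relaxed solution is already integral, and bound is None if infeasible.
    """
    room = capacity - sum(weights[i] for i, x in fixed.items() if x == 1)
    if room < 0:
        return None, None
    bound = sum(values[i] for i, x in fixed.items() if x == 1)
    free = [i for i in range(len(values)) if i not in fixed]
    free.sort(key=lambda i: values[i] / weights[i], reverse=True)
    for i in free:
        if room == 0:
            break
        if weights[i] <= room:
            room -= weights[i]
            bound += values[i]
        else:
            # Item i only fits partially: this is the fractional variable.
            return bound + values[i] * room / weights[i], i
    return bound, None


def branch_and_bound(values, weights, capacity):
    """The 'MIP layer': branching and pruning on top of the relaxation."""
    best, nodes = 0, 0
    stack = [{}]
    while stack:
        fixed = stack.pop()
        nodes += 1
        bound, frac = lp_relaxation(values, weights, capacity, fixed)
        if bound is None or bound <= best:
            continue  # infeasible, or cannot beat the incumbent: prune
        if frac is None:
            best = bound  # integral relaxation: new incumbent
            continue
        stack.append({**fixed, frac: 0})  # branch: variable fixed to 0
        stack.append({**fixed, frac: 1})  # branch: variable fixed to 1
    return best, nodes


best, nodes = branch_and_bound([60, 100, 120], [10, 20, 30], 50)
print(best, nodes)
```

Swapping lp_relaxation for a different implementation changes how fast each node is processed, while branch_and_bound decides how many nodes are visited at all. That is the sense in which "SCIP - CPLEX" and plain CPLEX can differ in speed even though both use CPLEX for the LPs.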
Why use SCIP with CPLEX (instead of a pure CPLEX approach)?
The why is not that easy to explain.
Keep in mind that all MIP solvers are heuristics-based, and on some problems SCIP will be faster than CPLEX (regardless of the underlying LP solver selected).
Keywords for some theory: NP-hardness (of MIP) and the No free lunch theorem
Faster could mean faster due to the MIP-based strategies, not due to the speed of the underlying LP solver, so you may even gain an overall speedup by using CPLEX on the subproblems!
The two MIP solvers probably also differ considerably with regard to parameters and accessibility of internal algorithmic components. You can tune SCIP in a much more general way than CPLEX, because it is open source.
As mattmilten mentioned in the comments, SCIP and CPLEX also differ in the problem classes they support. One example is support for some special nonlinear constraints (resulting in a MINLP). When using SCIP for these kinds of problems, you can still use CPLEX's LP solver internally (same arguments as above).

How to compare gams vs matlab in optimization

Is it possible to use Matlab instead of GAMS for optimization problems? How do they compare? In other words, can every problem solved with GAMS be solved with some Matlab toolbox? And finally, what optimization tools are available in Matlab?
Matlab and GAMS are very different in how they approach modeling. GAMS is organized around the concept of equations (essentially, an optimization model is a collection of equations). This holds for LP, MIP, MINLP and other types of models. These equations largely resemble how you would write things down in math. Matlab views an optimization model (LP/MIP) as a matrix (or two matrices, depending on whether we deal with equalities or inequalities). You have to translate your constraints into these one or two matrices by populating them. Depending on the model this can be a difficult task. For structured models it is not so bad, but for large, complex models the GAMS approach is much more natural and convenient.
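To make the matrix view concrete, here is a small sketch of the bookkeeping it requires, written in Python rather than Matlab. It shows a transportation model whose two constraint families ("shipments out of a plant stay within supply" and "shipments into a city meet demand") have to be flattened into rows of one inequality matrix. All names and numbers are hypothetical.

```python
# Hypothetical data for a small transportation model.
supply = {"plant1": 350, "plant2": 600}
demand = {"cityA": 325, "cityB": 300, "cityC": 275}

# Each variable x[p, c] must be assigned a column index by hand.
columns = [(p, c) for p in supply for c in demand]
col = {pc: j for j, pc in enumerate(columns)}

A_ub, b_ub = [], []

# Equation view: sum over c of x[p, c] <= supply[p], for every plant p.
for p in supply:
    row = [0.0] * len(columns)
    for c in demand:
        row[col[p, c]] = 1.0
    A_ub.append(row)
    b_ub.append(supply[p])

# Equation view: sum over p of x[p, c] >= demand[c], for every city c.
# The matrix form only has "<=" rows, so the inequality is negated.
for c in demand:
    row = [0.0] * len(columns)
    for p in supply:
        row[col[p, c]] = -1.0
    A_ub.append(row)
    b_ub.append(-demand[c])

for row, rhs in zip(A_ub, b_ub):
    print(row, "<=", rhs)
```

In an equation-based language each constraint family is written once over its index sets; the column map and sign flips above are exactly the manual translation that the matrix approach pushes onto the modeler.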
NLP problems in GAMS are just like LPs: equation based. GAMS uses automatic differentiation so no need to write gradients and GAMS targets large scale NLP problems. Matlab uses functions in their NLP solvers, and these are mostly suited for smaller problems. Gradients are provided by the user.
GAMS supports many solvers. MATLAB has an optimization toolbox, but these solvers are largely targeted to smaller and medium sized models. Having said that many state-of-the-art solvers have a Matlab interface (e.g. Cplex, Gurobi).
Not all solvers available under GAMS are directly callable from Matlab but many are (sometimes using external toolboxes).

Sensitivity analysis in LP solvers from MATLAB

As far as I understand, CPLEX, LP_solve and GLPK, among other LP solvers, offer sensitivity analysis.
I have the above three solvers installed on my machine, along with these two MATLAB wrappers:
CPLEX for MATLAB API (for CPLEX)
YALMIP (a general MATLAB wrapper for several solvers)
I looked in the documentation of these two wrappers but could not find a way of running sensitivity analysis from them. Do they support it? If not, are there any LP solvers that offer MATLAB support for their sensitivity analysis?
What do I mean by sensitivity analysis?
I mean sensitivity analysis with respect to the cost function and constraints. Conceptually speaking, sensitivity analysis tries to address the following question:
How would the solution change if some aspect of the problem is changed?
For example:
What is the range of values the coefficient for the variable j can take without affecting the optimality of the solution?
More specifically, here is a list of the Java, C++ and C APIs that CPLEX provides for sensitivity analysis.
Here is information about the sensitivity analysis provided by LP_solve. You can find the help text for the previous link within LP_solve's main reference guide by searching for "sensitivity" here.
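As a concrete illustration of the coefficient-range question above, the following sketch computes that range by brute force for a tiny two-variable maximization problem. It is not how CPLEX, LP_solve or GLPK compute sensitivity information (they work from the optimal basis), and it uses Python rather than MATLAB; the helper cost_range and the hard-coded vertex list are hypothetical, and the approach only works when all vertices of a bounded feasible region can be listed.

```python
import math


def cost_range(vertices, c, j):
    """Range of c[j] for which the current optimal vertex stays optimal.

    vertices: all corner points of a bounded feasible region.
    c: objective coefficients of a maximization problem.
    """
    def objective(v):
        return sum(ci * vi for ci, vi in zip(c, v))

    best = max(vertices, key=objective)
    lo, hi = -math.inf, math.inf
    for v in vertices:
        if v == best:
            continue
        d = [b - x for b, x in zip(best, v)]  # best - v
        rest = sum(c[k] * d[k] for k in range(len(c)) if k != j)
        # best stays at least as good as v while c[j] * d[j] + rest >= 0.
        if d[j] > 0:
            lo = max(lo, -rest / d[j])
        elif d[j] < 0:
            hi = min(hi, -rest / d[j])
    return best, lo, hi


# maximize 3x + 5y  subject to  x <= 4, 2y <= 12, 3x + 2y <= 18, x, y >= 0
vertices = [(0, 0), (4, 0), (4, 3), (2, 6), (0, 6)]
c = [3, 5]

best, lo_x, hi_x = cost_range(vertices, c, 0)
_, lo_y, hi_y = cost_range(vertices, c, 1)
print(best, (lo_x, hi_x), (lo_y, hi_y))
```

Here the optimum (2, 6) remains optimal while the coefficient of x stays between 0 and 7.5 (with the coefficient of y held at 5), or while the coefficient of y stays at or above 2 (with the coefficient of x held at 3).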

MATLAB vs Python for programming Probability Based Program

I am writing programs that are based on robots navigating through mazes (would involve stochastic programming).
Since it will involve heavy matrix handling (plus point for MATLAB) and simulating a robot (plus point for Prolog), I am in a dilemma between the choice of MATLAB and Prolog.
Note: I do have MATLAB at my work environment, hence cost is not an issue.
As mentioned previously, I am not sure whether you are looking for a comparison between MATLAB and Python or between MATLAB and Prolog. I can speak to the former, at least: MATLAB provides fast linear algebraic computation and a great IDE... and that's about it. Python will cost you far fewer headaches (and dollars), and you can manage "heavy matrix handling" nearly as easily if you add Numpy in particular, or SciPy in general.
Also, VPython (Visual Python) is a great 3D visualization tool that uses Numpy under the hood. I developed a robot simulator using VPython; you can check out screenshots and example code (for simple wall-following maze navigation) in a recent blog post.