Efficient formulations for reservoir constraints in CP solvers - or-tools

I am fairly new to constraint programming, so apologies for any misconceptions.
Assume the following example problem:
A pump is moving water from a bottomless lake into a tightly constrained reservoir at a set rate. There is demand on the reservoir over the day with each hour having different demand. The goal is to ensure the reservoir stays within limits by turning the pump on and off.
Example data for a 6hr problem:
res_bounds = [0, 20]
demand = [0,8,1,13,11,12]
A feasible solution would be:
pump_schedule = [1,1,0,1,1,1]
Giving a volume profile of
[10,12,11,8,7,5]
When modelling the above example with a MIP, the "standard" formulation is discretized 1-hour steps with an underlying mass-balance model. These models, however, tend to scale poorly when the time steps become too fine or the size of the problem gets out of hand.
At first glance, CP offers a solution with features such as interval variables for efficiently scheduling the pump at a very fine resolution. However, tracking the reservoir levels proves a challenge as the model has no visibility of the levels in the middle of the pump running interval.
Google's or-tools offers AddReservoirConstraintWithActive to model the problem exactly as I described it, but seems to require discrete boolean time variables, running into the same issue as the MIP solvers.
In short, is there a way in constraint programming to model such reservoir constraints without boolean time variables? I tried looking around online but found nothing that would help so far.
Thanks in advance.

Related

How to test whether the ODE integration has reached equilibrium?

I am using Matlab for this project. I have introduced some modifications to the ode45 solver.
I am using sometimes up to 64 components, all in the [0,1] interval and the components sum up to 1.
At some intervals I halt the integration process in order to run a quick check to see whether further integration is needed and I am looking for some clever way to efficiently figure this one.
I have found four cases and I should be able to detect each of them during a check:
1: The system has settled into an equilibrium and all components are unchanged.
2: Three or more components are wildly fluctuating in a periodic manner.
3: One or two components are changing very rapidly with low amplitude and short frequency.
4: None of the above is true and the integration must be continued.
To give an idea: I have found it to be a good practice to use the last ~5k states generated by the ode45 solver to a function for this purpose.
In short: how does one detect equilibrium or a nonchanging periodic pattern during ODE integration?
Steady-state only occurs when the time derivatives your model function computes are all 0. A periodic solution like you described corresponds rather to a limit cycle, i.e. oscillations around an unstable equilibrium. I don't know if there are methods to detect these cycles. I might update my answers to give more info on that. Maybe an idea would be to see if the last part of the signal correlates with itself (with a delay corresponding to the cycle period).
Note that if you are only interested in the steady state, an implicit method like ode15s may be more efficient, as it can "dissipate" all the transient fluctuations and use much larger time steps than explicit methods, which must resolve the transient accurately to avoid exploding. However, they may also dissipate small-amplitude limit cycles. A pragmatic solution is then to slightly perturb the steady-state values and see if an explicit integration converges towards the unperturbed steady-state.
Something I often do is to look at the norm of the difference between the solution at each step and the solution at the last step. If this difference is small for a sufficiently high number of steps, then steady-state is reached. You can also observe how the norm $||frac{dy}{dt}||$ converges to zero.
This question is actually better suited for the computational science forum I think.

What does Default Solver Iteration Means?

I'm trying to understand Unity Physics engine (PhysX), Can somebody explain that what exactly Default Solver Iterations and Default Solver Velocity Iterations are?
This is from Unity documentation :
Default Solver Iterations: Define how many solver processes Unity runs
on every physics frame. Solvers are small physics engine tasks which
determine a number of physics interactions, such as the movements of
joints or managing contact between overlapping Rigidbody components.
This affects the quality of the solver output and it’s advisable to
change the property in case non-default Time.fixedDeltaTime is used,
or the configuration is extra demanding. Typically, it’s used to
reduce the jitter resulting from joints or contacts.
Please provide some example of how it works and how does increase or decreasing it affects the final result?
I asked this question on Unity Forum and Hyblademin answered it:
In mathematics, an iterative solution method is any algorithm which
approximately solves a system of unknown values like [x1, x2, x3 ...
xn] by repeating a set of steps (iterating). Often, the system of
interest is a set of linear equations exactly like those seen in
algebra class but with a prohibitively high number of unknowns.
Starting with a guess for the solution to each unknown, which could be
based on a similar, known system or could be from a common starting
point like [1, 1, 1 ... 1], a procedure is carried out which gives an
approximate solution which will be closer to the exact values. After
only one iteration, the approximation won't be a very good one unless
the initial guess was already close. But the procedure can be repeated
with the first approximation as the new input, which will give a
closer approximation.
After repeating a few more times, we can expect a reliable
approximation. It still isn't exact, which we could confirm by just
plugging in our answers into our original system and seeing that it
isn't quite right (after simplifying, we would end up with things like
10=10.001 or something to that effect). That said, if the
approximation is close enough for our application, we stop iterating
and use it.
These lecture notes courtesy of a Notre Dame course give a nice
example of this in action using the well-known Jacobi method. Carrying
out an iteration of an iterative method outputs an approximation that
is better than the input because the methods are defined in a way that
causes this to happen, and this is a property called convergence. When
looking at why any given method converges, things get abstract pretty
quickly. I think this is outside the scope of your question,
especially since I don't know what method(s) Unity uses anyway.
When physics is calculated in Unity, we end up with a lot of systems
of equations. We could draw a free-body diagram to show forces and
torques during a collision for a given FixedUpdate in a Unity runtime
to show this. We could try to solve them "directly", which means to
use logical relationships to determine the exact results of the values
(like solving for x in algebra class), but even if the systems are on
the simple side, doing a lot of them will slow the execution to a
crawl. Luckily, iterative, "indirect" methods can be used to get a
pretty good approximation at a fraction of the computing cost.
Increasing the number of iterations will lead to more precise
approximate solutions. There is a point where increasing the number of
iterations gives an increase in precision that is not at all worth the
processing overhead of doing another iteration. But the number of
iterations for this point depends on what you need your project to do.
Sometimes a given arrangement of physics objects will result in jitter
with the default settings that might be improved with more solver
iterations, which is mentioned in the manual entry. There isn't a
great way to determine if changing solver iteration counts will
improve behavior or performance in the way that you need, except for
just trial and error (use the Profiler for a more-objective indication
of performance impact).
https://forum.unity.com/threads/what-does-default-solver-iteration-means.673912/#post-4512004

Discrete and Dynamic system Matlab

I'm using recusive least squares (RLS) to identify system parameters for a dynamical system. The RLS algorithm is implemented in discrete time, while the real system is continuous. In practice this is easily done, but how can I simulate these two together? A sequential solution doesn't help, since I want to use the RLS estimate to influence the system input.
The built-in event-triggering can only stop integration, if I got that right. Thus, I'd have to stop at each sampling point of the RLS algorithm and then solve the ode between samples. -> How is this implemented in Simulink?
The only real solution I found was to implement my own RK45 with adaptive step size. It is designed to take discrete and continuous systems (ode and difference equations) and solves with adaptive step size until a new sample has to be taken. This method works like a charm - with slow dynamics only the discrete points are sampled for sufficiently small sampling times and fast dynamics yield small integration step sizes, as expected!
Also the implementation was way less effort than expected and compares surprisingly well to matlabs ode45, ie. lower computational cost, higher accuracy, less oscillations after discrete jumps in the ode!

Neural Networks back propogation

I have gone through neural networks and have understood the derivation for back propagation almost perfectly(finally!). However, I had a small doubt.
We are updating all the weights simultaneously, so what is the guarantee that they lead to a smaller cost. If the weights are updated one by one, it would definitely lead to a lesser cost and it would be similar to linear regression. But if you update all the weights simultaneously, might we not cross the minima?
Also, do we update the biases like we update the weights after each forward propagation and back propagation of each test case?
Lastly, I have started reading on RNN's. What are some good resources to understand BPTT in RNN's?
Yes, updating only one weight at the time could result in decreasing error value every time but it's usually infeasible to do such updates in practical solutions using NN. Most of today's architectures usually have ~ 10^6 parameters so one epoch for every parameter could last enormously long. Moreover - because of nature of backpropagation - you usually have to compute loads of different derivatives in order to compute derivative with respect to a parameter given - so you will waste a lot of computations when using such approach.
But the phenomenon which you mention has been noticed a long time ago and there are some ways in dealing with it. There are two most common issues connected with it:
Covariance shift: it's when error and weight updates of a layer given strongly depends on output from previous layer, so when you update it - the results in the next layer might be different. The most common way to deal with this problem right now is Batch normalization.
Nolinear function vs Linear Differentation: it's quite uncommon when you think about BP but derivative is a linear operator which might generate a lot of problems in gradient descent. The most countintuitive example is the fact that if you multiply your input by a constant then every derivative will also be multiplied by the same number. This may lead to a lot of problems but most of recent methods of learning do a great job in dealing with it.
About BPTT I stronly recomend you Geoffrey Hinton course about ANN and especially this video.

particle swarm optimization inertia factor

i am reading in soft computing algorithms ,currently in "Particle Swarm Optimization ",i understand the technique in general but ,i stopped at mathematical or physics part which i can't imagine or understand how it works or how it affect the flying,that part is the first part in the equation which update the velocity which is called the "Inertia Factor"
the complete update velocity equation is :
i read in one article in section 2.3 "Ineteria Factor" that:
"This variation of the algorithm aims to balance two possible PSO tendencies (de-
pendent on parameterization) of either exploiting areas around known solutions
or explore new areas of the search space. To do so this variation focuses on the
momentum component of the particles' velocity equation 2. Notice that if you
remove this component the movement of the particle has no memory of the pre-
vious direction of movement and it will always explore close to a found solution.
On the other hand if the velocity component is used, or even multiplied by a w
(inertial weight, balances the importance of the momentum component) factor
the particle will tend to explore new areas of the search space since it cannot
easily change its velocity towards the best solutions. It must rst \counteract"
the momentum previously gained, in doing so it enables the exploration of new
areas with the time \spend counteracting" the previous momentum. This vari-
ation is achieved by multiplying the previous velocity component with a weight
value, w."
the full pdf at: https://www.google.com.eg/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CDIQFjAA&url=http%3A%2F%2Fweb.ist.utl.pt%2F~gdgp%2FVA%2Fdata%2Fpso.pdf&ei=0HwrUaHBOYrItQbwwIDoDw&usg=AFQjCNH8vChXHXWz_ydHxJKAY0cUa94n-g
but i can't also imagine how physicaly or numerically this is happend and how this factor affect going from exploration level to exploitative level ,so need a numerical example to see how it's work and imagine how it's work.
also ,in Genetic Algorithm there's a schema theorem which is a proof of GA success of finding optimum solution,is there's such athoerm for PSO.
It's not easy to explain PSO using mathematics (see Wikipedia article for example).
But you can think like this: the equation has 3 parts:
particle speed = inertia + local memory + global memory
So you control the 'importance' of these components by varying the coefficientes in each part.
There's no analytical way to see this, unless you make the stocastic part constant and ignore things like particle-particle interation.
Exploit: take advantage of the best know solutions (local and global).
Explore: search in new directions, but don't ignore the best know solutions.
In a nutshell, you can control how much importance to give for the particle current speed (inertia), the particle memory of the best know solution, and the particle memory of the swarm best know solution.
I hope it can help you!
Br's
Inertia was not the part of the original PSO algorithm introduced by Kennedy and Eberhart in 1995. It's been three years until Shi and Eberhart published this extension and showed (to some extent) that it works better.
One can set that value to a constant (supposedly [0.8 to 1.2] is best).
However, the point of the parameter is to balance exploitation and exploration of space, and
authors got best results when they defined the parameter with a linear function, that decreases over time from [1.4 to 0].
Their rationale was that first one should exploit solutions to find a good seed and later exploit area around the seed.
My feeling about it is that the closer you are to 0, the more chaotic turns particles make.
For a detailed answer refer to Shi, Eberhart 1998 - "A modified Particle Swarm Optimizer".
Inertia controls the influence of the previous velocity.
When high, cognitive and social components are less relevant. (particle keeps going its way, exploring new portions of the space)
When low, particle explores better the space where the best-so-far optimum has been found
Inertia can change over time: Start high, later decrease