Too many constraints in the optimization experiments - anylogic

I'm trying to run an optimization experiment that has 2460 constraints. I have defined them one by one in the constraints section. I used x1, x2, x3, etc. to reduce the number of bytes, as recommended by AnyLogic support according to this article: https://noorjax.com/2018/10/17/your-agent-is-too-big-memory-problem/
However, I'm getting the following error
The code of method reset() is exceeding the 65535 bytes limit
in Optimization.java, the error is in the following line:
@Override
@AnylogicInternalCodegenAPI
public void reset() {
setLogBufferLength( 65535 );
I think this is because Anylogic is storing the constraints as "constraints" or something like this, so reducing the name of the constraints itself to be (x1, x2...) does not help.
Is there a way to load constraints from a data file instead of hardcoding them in the constraints panel of the optimization experiment, to avoid the 65535 bytes limit in the Java code?
Or do you have any other suggestions?
(I know 2460 are too many constraints, but I need them if I don't want to reduce the number of parameters and I don't want to relax constraints)

Related

Or-tools cp_sat solver is inconsistent in results

I have an optimization problem and I am using or-tools cp_sat solver. The number of variables is around 3500 (all boolean) but the number of constraints is huge (~750000). Out of 3500 variables, ~3000 are directly dependent on the other 500.
There are 2 scenarios I tested:
1. With a simple objective function depending on ~3000 constraint variables.
2. With a complex objective function depending on ~3000*3000 new variables, where each new variable is pairwise logical_and of the variables in (1).
For each case, we seed the solver with hints for ~500 variables.
For 1, it cannot find an optimal solution in reasonable time. After around 30-45 minutes of runtime, the improvement to the objective function is negligible, but the solutions are satisfactory.
For 2, behavior is weird. Around half of the time, it claims that the problem is INFEASIBLE, half of the time, claims that it found OPTIMAL solution, but only returns back the solution implied by the hints. Only rarely (less than a couple percent of the runs), it does some optimization and returns FEASIBLE.
In addition, case 1 uses around 4-6 GB of memory but case 2 uses 100-120 GB of memory.
Is the behavior in case 2 expected? How should I approach debugging this?
For case 2, the problem becomes very big. You are creating 9M Boolean variables.
Are you using multithreading?
Can you try reducing the size of the model and see if this is still flaky?
Is the problem creation deterministic?
Are you using large coefficients? Is it possible you are hitting an integer overflow error?
Thanks
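To put the size of case 2 in perspective, here is a minimal, solver-independent sketch in plain Python. It assumes each pairwise AND is encoded with the common three-constraint linearization (z <= x, z <= y, z >= x + y - 1), which is not necessarily what the asker's model or CP-SAT does internally. The helper names are hypothetical, and the unordered-pair count assumes only pairs with i < j are needed.

```python
from itertools import product


def and_linearization_holds(x, y, z):
    """Check the three linear constraints that force z == (x AND y)."""
    return z <= x and z <= y and z >= x + y - 1


def feasible_z(x, y):
    """Values of z in {0, 1} allowed by the constraints for fixed x and y."""
    return [z for z in (0, 1) if and_linearization_holds(x, y, z)]


# For every 0/1 assignment, the only feasible z is exactly x AND y.
encoding_is_exact = all(
    feasible_z(x, y) == [x & y] for x, y in product((0, 1), repeat=2)
)

n = 3000                                  # variables feeding the objective (from the question)
ordered_pairs = n * n                     # one AND variable per ordered pair
unordered_pairs = n * (n - 1) // 2        # assumption: AND is symmetric, keep i < j only
constraints_ordered = 3 * ordered_pairs   # three constraints per AND variable
```

If the objective really only needs each pair once, restricting to i < j roughly halves the number of extra variables, which is one concrete way to try a smaller model.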

in genetic algorithm, How to deal with binary representation of constraints of functions?

For example, 0<=x<=31, the length of binary form of 31 is 5, since 31=11111 in base 2.
However, how should I deal with, say, 0<=x<=25? If I keep length 5, numbers like 11110 (30) may be generated, which exceed 11001 (25).
I wonder if there is a mapping which could solve this.
Thanks a lot!
If I understand you correctly, you are asking how to deal with automatically generated solutions that fall outside the constraint you have. In this case you have several options. First, you could simply kill these invalid solutions and generate more until one fits within your constraint. The better option is to normalise all of your values within a specified range, e.g. 0 to 31 or 0 to 64 etc.
I have an example of this type of normalisation in the Evaluate Fitness function of this example.
http://johnnewcombe.net/blog/gaf-part-2/
The code is based around the Genetic Algorithm Framework for .Net but the technique can be applied to any library or home grown algorithm.
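As a minimal sketch of the normalisation idea (not the code from the linked example), the function below scales a 5-bit chromosome linearly onto 0..25 so that every bit string decodes to a valid value. The name `decode` and its parameters are hypothetical.

```python
def decode(bits, lo=0, hi=25):
    """Map a fixed-length bit string onto the integer range [lo, hi]."""
    raw = int(bits, 2)                  # e.g. "11110" -> 30
    max_raw = (1 << len(bits)) - 1      # 31 for 5 bits
    # Scale linearly, then round to the nearest integer in the range.
    return lo + round(raw * (hi - lo) / max_raw)


# Decode every possible 5-bit chromosome.
all_codes = [format(i, "05b") for i in range(32)]
decoded = [decode(code) for code in all_codes]
```

With this mapping 11110 decodes to 24 and 11111 to 25. The trade-off is that 32 bit strings cover 26 values, so some values are represented by two bit strings and are slightly more likely to be sampled.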

How to optimize more than 3 objective functions in MATLAB? gamultiobj is not efficient

I am using MATLAB's gamultiobj optimization.
Since I have 6 to 12 objective functions, gamultiobj handles the problem inefficiently: it always terminates because the number of generations is exceeded, not because the changes in the objective functions become smaller.
I looked at the gamultiobj options documentation, but it didn't help:
http://www.mathworks.com/help/gads/examples/multiobjective-genetic-algorithm-options.html
1- How can I increase the capability of gamultiobj to handle this number of objective functions?
2- Is there a better way at all (using MATLAB)?
Well,
this is my update:
1- I increased the number of generations and the population size, and assigned a proper initial population using the common ga options. It worked better (I didn't know that these options work with gamultiobj too, but now I do; it isn't stated explicitly anywhere in the documentation).
2- After running and inspecting the results, I realized that gamultiobj can handle many objective functions efficiently provided that they are independent. When the objective functions are strongly dependent (which is the case in my problem, unfortunately), the gamultiobj solver's efficiency decreases dramatically.
thanks !
You should increase the number of generations and possibly play with options such as crossover, mutation, and the constraint bounds within which you expect the solution.
The bounds need to be specified correctly, and a good initial population is also pretty much needed to get to the set of parameters that you want to optimize.

Simulink: PID Controller - difference between back-calculation and clamping for anti-windup?

I need to implement an anti-windup (output limitation) for my PID controller. Simulink offers two options, back-calculation and clamping (documentation), which seem to deliver equal results. I know what back-calculation does mathematically. It requires defining the back-calculation gain Kb. This gain depends on how long my controller is saturated, so it is actually a dynamic value (because I may have a high variation of saturation times). Do you see a way to control this value? (In this case it would probably be necessary to build my own PID controller, as shown in the documentation above or in the picture below.)
Which brings me to the question: what is clamping actually doing? And what are the other differences? Which one is faster, and which one is more robust against stiff slopes? Does anybody have experience using both?
Not sure if this fully answers the question, but the PID Controller documentation page explains a bit more about clamping:
clamping
Stops integration when the sum of the block components
exceeds the output limits and the integrator output and block input
have the same sign. Resumes integration when the sum of the block
components exceeds the output limits and the integrator output and
block input have opposite sign. The integrator portion of the block
is:
The clamping circuit implements the logic necessary to determine whether integration continues.
If you select the clamping option and look under the mask, you can probably see the details of the clamping circuit.
In addition to am304's answer, there are some more things to consider.
Clamping
Clamping will always work. It detects when there is integrator overflow and sets the integral path of the PID controller to zero to avoid windup by using a simple switch.
Clamping is a commonly used anti-windup method, especially in digital control systems. In serious applications, however, there is also forward clamping involved, which evaluates the controller input as well. This mechanism must be implemented manually.
Back Calculation
Back calculation depends heavily on the back-calculation coefficient Kb. If you don't know how to actually calculate the parameter Kb, don't use back-calculation. This method calculates the difference between the actual controller output and the saturated output and subtracts it from the I-gain path, amplified by Kb.
In most cases the default value Kb = 1 will lead to worse results than clamping; it is even possible that it has no effect at all. Kb should be calculated based on the sampling time or, in case a D-gain is involved, based on the D- and I-gains. Appropriate literature should be consulted to calculate the coefficient. Back calculation with a properly set coefficient enables better dynamics than clamping!
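To make the difference concrete, here is a minimal discrete-time sketch of a PI controller with both anti-windup schemes. It is an illustration under stated assumptions, not the Simulink block's implementation: the function names, the forward-Euler integration, and the "none" mode (added only for comparison) are all hypothetical. The clamping condition follows the documentation text quoted above (saturated, and integrator output and input have the same sign).

```python
def pi_step(error, integ, kp, ki, dt, u_min, u_max, mode="clamping", kb=1.0):
    """Advance a discrete PI controller by one step.

    Returns (saturated_output, new_integrator_state).
    """
    u_unsat = kp * error + integ                # sum of the P and I paths
    u_sat = min(max(u_unsat, u_min), u_max)     # output limitation
    saturated = u_unsat != u_sat

    if mode == "clamping":
        # Stop integrating while saturated and the integrator has the
        # same sign as the error, i.e. it would wind up further.
        winding_up = saturated and error * integ > 0
        d_integ = 0.0 if winding_up else ki * error
    elif mode == "back-calculation":
        # Feed the saturation excess back into the integrator, scaled by kb.
        d_integ = ki * error + kb * (u_sat - u_unsat)
    else:
        # "none": plain integrator, shown only for comparison (assumption).
        d_integ = ki * error

    return u_sat, integ + d_integ * dt


def final_integrator(mode, steps=100):
    """Hold a constant error of 1.0 and report the final integrator state."""
    integ = 0.0
    for _ in range(steps):
        _, integ = pi_step(1.0, integ, kp=1.0, ki=1.0, dt=0.1,
                           u_min=-1.0, u_max=1.0, mode=mode)
    return integ


results = {m: final_integrator(m)
           for m in ("none", "clamping", "back-calculation")}
```

In this toy run the unprotected integrator keeps growing, clamping freezes it as soon as the output saturates, and back-calculation lets it settle at a bounded value that depends on kb. That dependence is why the choice of Kb matters so much.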

Virtual Memory Page Replacement Algorithms

I have a project where I am asked to develop an application to simulate how different page replacement algorithms perform (with varying working set size and stability period). My results:
Vertical axis: page faults
Horizontal axis: working set size
Depth axis: stable period
Are my results reasonable? I expected LRU to have better results than FIFO. Here, they are approximately the same.
For random, stability period and working set size don't seem to affect the performance at all. I expected graphs similar to FIFO & LRU, just with worse performance. If the reference string is highly stable (few branches) and has a small working set size, shouldn't it still have fewer page faults than an application with many branches and a big working set size?
More Info
My Python Code | The Project Question
Length of reference string (RS): 200,000
Size of virtual memory (P): 1000
Size of main memory (F): 100
Number of times a page is referenced (m): 100
Size of working set (e): 2 - 100
Stability (t): 0 - 1
Working set size (e) & stable period (t) affect how the reference string is generated.
|-----------|--------|------------------------------------|
0           p        p+e                                  P-1
So assume the above is the virtual memory of size P. To generate reference strings, the following algorithm is used:
Repeat until reference string generated
    pick m numbers in [p, p+e]. m simulates or refers to number of times page is referenced
    pick random number, 0 <= r < 1
    if r < t
        generate new p
    else (++p)%P
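The generation procedure above can be sketched in Python as follows. This is not the asker's linked code; the function name and the `seed` argument are hypothetical, and wrapping page numbers with `% P` when `p + e` runs past `P - 1` is an assumption the pseudocode does not state.

```python
import random


def generate_reference_string(length, P, e, m, t, seed=None):
    """Generate `length` page references with locality, per the pseudocode."""
    rng = random.Random(seed)
    refs = []
    p = rng.randrange(P)                      # start of the current locality
    while len(refs) < length:
        # m references drawn uniformly from the window [p, p + e].
        for _ in range(m):
            refs.append(rng.randint(p, p + e) % P)   # wrap-around: assumption
        if rng.random() < t:
            p = rng.randrange(P)              # jump to a new locality
        else:
            p = (p + 1) % P                   # slide the window by one page
    return refs[:length]


refs = generate_reference_string(length=2000, P=1000, e=10, m=100, t=0.1, seed=0)
```

Note that within one window the m references touch at most e + 1 distinct pages, so with m = 100 and small e most references are repeats of recently used pages.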
UPDATE (In response to @MrGomez's answer)
However, recall how you seeded your input data: using random.random,
thus giving you a uniform distribution of data with your controllable
level of entropy. Because of this, all values are equally likely to
occur, and because you've constructed this in floating point space,
recurrences are highly improbable.
I am using random, but it is not totally random either: references are generated with some locality through the use of the working set size and number of pages referenced parameters?
I tried increasing numPageReferenced relative to numFrames in the hope that it would reference a page currently in memory more often, thus showing the performance benefit of LRU over FIFO, but that didn't give me a clear result though. Just FYI, I tried the same app with the following parameters (the Pages/Frames ratio is still kept the same; I reduced the size of the data to make things faster).
--numReferences 1000 --numPages 100 --numFrames 10 --numPageReferenced 20
The result is
Still not such a big difference. Am I right to say if I increase numPageReferenced relative to numFrames, LRU should have a better performance as it is referencing pages in memory more? Or perhaps I am mis-understanding something?
For random, I am thinking along the lines of:
Suppose there's high stability and a small working set. It means that the pages referenced are very likely to be in memory. So the need for the page replacement algorithm to run is lower?
Hmm maybe I got to think about this more :)
UPDATE: Thrashing less obvious on lower stability
Here, I am trying to show the thrashing as the working set size exceeds the number of frames (100) in memory. However, notice that thrashing appears less obvious with lower stability (high t). Why might that be? Is the explanation that as stability becomes low, page faults approach the maximum, so it does not matter as much what the working set size is?
These results are reasonable given your current implementation. The rationale behind that, however, bears some discussion.
When considering algorithms in general, it's most important to consider the properties of the algorithms currently under inspection. Specifically, note their corner cases and best and worst case conditions. You're probably already familiar with this terse method of evaluation, so this is mostly for the benefit of those reading here who may not have an algorithmic background.
Let's break your question down by algorithm and explore their component properties in context:
FIFO shows an increase in page faults as the size of your working set (length axis) increases.
This is correct behavior, consistent with Bélády's anomaly for FIFO replacement. As the size of your working page set increases, the number of page faults should also increase.
FIFO shows an increase in page faults as system stability (1 - depth axis) decreases.
Noting your algorithm for seeding stability (if random.random() < stability), your results become less stable as stability (S) approaches 1. As you sharply increase the entropy in your data, the number of page faults, too, sharply increases and propagates the Bélády's anomaly.
So far, so good.
LRU shows consistency with FIFO. Why?
Note your seeding algorithm. Standard LRU is most optimal when you have paging requests that are structured to smaller operational frames. For ordered, predictable lookups, it improves upon FIFO by aging off results that no longer exist in the current execution frame, which is a very useful property for staged execution and encapsulated, modal operation. Again, so far, so good.
However, recall how you seeded your input data: using random.random, thus giving you a uniform distribution of data with your controllable level of entropy. Because of this, all values are equally likely to occur, and because you've constructed this in floating point space, recurrences are highly improbable.
As a result, your LRU is perceiving each element to occur a small number of times, then to be completely discarded when the next value was calculated. It thus correctly pages each value as it falls out of the window, giving you performance exactly comparable to FIFO. If your system properly accounted for recurrence or a compressed character space, you would see markedly different results.
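The role of recurrence can be checked with a small, self-contained sketch. These are generic fault counters with hypothetical names, not the asker's code: they compare FIFO and LRU on a textbook-style reference string that re-uses recent pages, and on a string with no re-use at all.

```python
from collections import OrderedDict, deque


def fifo_faults(refs, frames):
    """Count page faults under FIFO replacement."""
    resident, queue, faults = set(), deque(), 0
    for page in refs:
        if page in resident:
            continue                            # hit: FIFO order is unchanged
        faults += 1
        if len(resident) == frames:
            resident.discard(queue.popleft())   # evict the oldest arrival
        resident.add(page)
        queue.append(page)
    return faults


def lru_faults(refs, frames):
    """Count page faults under LRU replacement."""
    resident, faults = OrderedDict(), 0
    for page in refs:
        if page in resident:
            resident.move_to_end(page)          # hit: mark as most recently used
            continue
        faults += 1
        if len(resident) == frames:
            resident.popitem(last=False)        # evict the least recently used
        resident[page] = True
    return faults


# Reference string with re-use of recently touched pages.
refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
fifo_count = fifo_faults(refs, frames=3)
lru_count = lru_faults(refs, frames=3)

# Reference string with no recurrence: every access is a compulsory fault.
no_reuse = list(range(50))
```

With three frames, FIFO takes 15 faults and LRU 12 on the first string, while on the string without re-use both fault on every access. The two policies can only be told apart when pages actually recur.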
For random, stability period and working set size doesn't seem to affect the performance at all. Why are we seeing this scribble all over the graph instead of giving us a relatively smooth manifold?
In the case of a random paging scheme, you age off each entry stochastically. Purportedly, this should give us some form of a manifold bound to the entropy and size of our working set... right?
Or should it? For each set of entries, you randomly assign a subset to page out as a function of time. This should give relatively even paging performance, regardless of stability and regardless of your working set, as long as your access profile is again uniformly random.
So, based on the conditions you are checking, this is entirely correct behavior consistent with what we'd expect. You get an even paging performance that doesn't degrade with other factors (but, conversely, isn't improved by them) that's suitable for high load, efficient operation. Not bad, just not what you might intuitively expect.
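One way to probe this on your own data is a random-replacement counter like the sketch below. The function name and `seed` parameter are hypothetical and this is not the asker's implementation. Feeding it reference strings with different amounts of re-use shows how much of the fault count comes from the access pattern rather than from the policy.

```python
import random


def random_faults(refs, frames, seed=None):
    """Count page faults when the victim frame is chosen uniformly at random."""
    rng = random.Random(seed)
    resident = []            # frame contents, indexable for random eviction
    in_memory = set()        # same pages, for O(1) membership tests
    faults = 0
    for page in refs:
        if page in in_memory:
            continue                         # hit: nothing to do
        faults += 1
        if len(resident) == frames:
            victim = rng.randrange(frames)   # pick a frame at random
            in_memory.discard(resident[victim])
            resident[victim] = page          # reuse the freed frame
        else:
            resident.append(page)            # memory not yet full
        in_memory.add(page)
    return faults


small_working_set = [i % 5 for i in range(200)]   # 5 pages, heavy re-use
no_reuse = list(range(200))                        # every page is new

faults_small = random_faults(small_working_set, frames=10, seed=0)
faults_no_reuse = random_faults(no_reuse, frames=10, seed=0)
```

In this sketch a working set that fits in memory only incurs the initial compulsory faults, whereas a string without re-use faults on every access. If your graph stays flat across working set sizes, checking how often pages actually recur in the generated string is a reasonable next step.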
So, in a nutshell, that's the breakdown as your project is currently implemented.
As an exercise in further exploring the properties of these algorithms in the context of different dispositions and distributions of input data, I highly recommend digging into scipy.stats to see what, for example, a Gaussian or logistic distribution might do to each graph. Then, I would come back to the documented expectations of each algorithm and draft cases where each is uniquely most and least appropriate.
All in all, I think your teacher will be proud. :)