I'm just starting with MiniZinc and I would like some help with this.
How can I write the following constraints:
I want to buy at most 4 GROUP offers, spend at most 100$ and I can buy
only one item from every group. Maximize quality.
int: items = 10;
set of int: GROUPS = 0..items;
set of int: PRODUCTS = 1..7;
set of int:BUYS = 1..4;
int : max_spent = 100;
array[GROUPS] of set of int : package = array1d(GROUPS,[{},{1,2,3},{4,7},{3,6},{1,4,5},{3,4,7},{1,2,5},{4,6},{3,7},{3,7,5},{2,3}]);
array[GROUPS] of int: package_price = array1d(GROUPS,[0,5,5,25,10,12,20,40,55,52,10]);
array[GROUPS] of int: package_quality = array1d(GROUPS,[0,7,2,7,2,3,5,4,9,6,5]);
The desired output should be something like:
{3,7} {4,6} {1,2,5} {}
quality = 10;
price = 97;
--- Update ---
So far I tried:
var int : will_buy;
will_buy = sum(i in BUYS)(package_price[i]);
constraint will_buy <= max_spent;
var int : quality;
quality = sum(i in GROUPS)(package_quality[i]);
array[GROUPS] of var BUYS: index;
include "element.mzn";
constraint forall(t in GROUPS)
(
element(index[t], package_price, package_price[t])
/\
element(index[t], package_quality, package_quality[t])
);
:/
Your problem is a variation on the well-known knapsack problem. A solution to that problem is given in the MiniZinc tutorial, chapter 3.6. Understanding the MiniZinc model for the knapsack problem should give you a guide to solving this one.
Alternatively, you might want to have a look at the global packing constraints; these might make the modeling easier.
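The instance is small enough to enumerate by brute force, which gives a handy cross-check for whatever MiniZinc model you end up with. The Java sketch below assumes one particular reading of the question: pick at most 4 of the 10 offers, spend at most 100, and maximize total quality. It deliberately ignores the "only one item from every group" rule, because its exact meaning is not clear from the question. The class and method names are made up for illustration.

```java
// Brute-force cross-check (assumed reading: at most 4 offers, budget 100, maximize quality).
final class OfferSearch {
    // Data copied from the question, without the dummy offer 0.
    static final int[] PRICE   = {5, 5, 25, 10, 12, 20, 40, 55, 52, 10};
    static final int[] QUALITY = {7, 2, 7, 2, 3, 5, 4, 9, 6, 5};

    // Returns {bestQuality, priceOfBest, bitmaskOfBest}; bit k set means offer k+1 is bought.
    static int[] best(int maxOffers, int budget) {
        int bestQuality = -1, bestPrice = 0, bestMask = 0;
        for (int mask = 0; mask < (1 << PRICE.length); mask++) {
            if (Integer.bitCount(mask) > maxOffers) continue; // too many offers
            int price = 0, quality = 0;
            for (int k = 0; k < PRICE.length; k++) {
                if ((mask & (1 << k)) != 0) {
                    price += PRICE[k];
                    quality += QUALITY[k];
                }
            }
            if (price <= budget && quality > bestQuality) {
                bestQuality = quality;
                bestPrice = price;
                bestMask = mask;
            }
        }
        return new int[] {bestQuality, bestPrice, bestMask};
    }

    public static void main(String[] args) {
        int[] r = best(4, 100);
        System.out.println("quality = " + r[0] + ", price = " + r[1]);
    }
}
```

With only 2^10 subsets the enumeration is instant; a constraint model becomes worthwhile once the number of offers grows or more side constraints are added.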
A little bit of background: I'm trying to make a model for clustering a Design Structure Matrix (DSM). I made a draft model and have a couple of questions. Most of them are not directly related to DSM per se.
include "globals.mzn";
int: dsmSize = 7;
int: maxClusterSize = 7;
int: maxClusters = 4;
int: powcc = 2;
enum dsmElements = {A, B, C, D, E, F,G};
array[dsmElements, dsmElements] of int: dsm =
[|1,1,0,0,1,1,0
|0,1,0,1,0,0,1
|0,1,1,1,0,0,1
|0,1,1,1,1,0,1
|0,0,0,1,1,1,0
|1,0,0,0,1,1,0
|0,1,1,1,0,0,1|];
array[1..maxClusters] of var set of dsmElements: clusters;
array[1..maxClusters] of var int: clusterCard;
constraint forall(i in 1..maxClusters)(
clusterCard[i] = pow(card(clusters[i]), powcc)
);
% #1
% constraint forall(i, j in clusters where i != j)(card(i intersect j) == 0);
% #2
constraint forall(i, j in 1..maxClusters where i != j)(
card(clusters[i] intersect clusters[j]) == 0
);
% #3
% constraint all_different([i | i in clusters]);
constraint (clusters[1] union clusters[2] union clusters[3] union clusters[4]) = dsmElements;
var int: intraCost = sum(i in 1..maxClusters, j, k in clusters[i] where k != j)(
(dsm[j,k] + dsm[k,j]) * clusterCard[i]
) ;
var int: extraCost = sum(el in dsmElements,
c in clusters where card(c intersect {el}) = 0,
k,j in c)(
(dsm[j,k] + dsm[k,j]) * pow(card(dsmElements), powcc)
);
var int: TCC = trace("\(intraCost), \(extraCost)\n", intraCost+extraCost);
solve maximize TCC;
Question 1
I was under the impression that constraints #1 and #2 are the same. However, it seems they are not. Why? What is the difference?
Question 2
How can I replace constraint #2 with all_different? Does it make sense?
Question 3
Why does trace("\(intraCost), \(extraCost)\n", intraCost+extraCost); show nothing in the output? The output I see using gecode is:
Running dsm.mzn
intraCost, extraCost
clusters = array1d(1..4, [{A, B, C, D, E, F, G}, {}, {}, {}]);
clusterCard = array1d(1..4, [49, 0, 0, 0]);
----------
<snipped to save space>
----------
clusters = array1d(1..4, [{B, C, D, G}, {A, E, F}, {}, {}]);
clusterCard = array1d(1..4, [16, 9, 0, 0]);
----------
==========
Finished in 5s 419msec
Question 4
The expression constraint (clusters[1] union clusters[2] union clusters[3] union clusters[4]) = dsmElements;, here I wanted to say that the union of all clusters should match the set of all nodes. Unfortunately, I did not find a way to make this big union more dynamic, so for now I just manually provide all clusters. Is there a way to make this expression return union of all sets from the array of sets?
Question 5
Basically, if I understand it correctly, for example from here, the Intra-cluster cost is the sum of all interactions within a cluster multiplied by the size of the cluster in some power, basically the cardinality of the set of nodes, that represents the cluster.
The Extra-cluster cost is a sum of interactions between some random element that does not belong to a cluster and all elements of that cluster multiplied by the cardinality of the whole space of nodes to some power.
The main question here is: are the intraCost and extraCost in the model correct (they seem to be, but still), and is there a better way to express these sums?
Thanks!
(Perhaps you would get more answers if you separate this into multiple questions.)
Question 3:
Here's an answer on the trace question:
When running the model, the trace actually shows this:
intraCost, extraCost
which is not what you expect, of course. trace is evaluated while the model is being compiled (flattened), and at that stage these two decision variables have no values yet, so MiniZinc shows only the variable names. They only get values once the (first) solution is found, and those values can then be shown in the output section.
trace is mostly used to see what's happening in loops where one can trace the (fixed) loop variables etc.
If you trace an array of decision variables then they will be represented in a different fashion, the array x will be shown as X_INTRODUCED_0_ etc.
And you can also use trace for domain reflection, e.g. using lb and ub to get the lower/upper value of the domain of a variable ("safe approximation of the bounds" as the documentation states it: https://www.minizinc.org/doc-2.5.5/en/predicates.html?highlight=ub_array). Here's an example which shows the domain of the intraCost variable:
constraint
trace("intraCost: \(lb(intraCost))..\(ub(intraCost))\n")
;
which shows
intraCost: -infinity..infinity
You can read a little more about trace here https://www.minizinc.org/doc-2.5.5/en/efficient.html?highlight=trace .
Update: Answers to questions 1, 2 and 4.
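Constraints #1 and #2 are meant to express the same thing, i.e. that the sets in clusters should be pairwise disjoint. Constraint #1 is a little different in that it loops over the decision variables themselves, while constraint #2 uses plain indices. One can guess that #2 is faster, since in #1 the where i != j compares decision variables and must be translated into extra constraints. It is also slightly weaker: two clusters that hold the same non-empty set make i != j false and so escape the disjointness check. (And using i < j instead of i != j in #2 should be a little faster still.)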
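The all_different constraint states something related but not identical: it only requires the sets to be pairwise different, not disjoint, and it also forbids two empty clusters. Depending on the underlying solver it might be fast if it is translated to an efficient algorithm, but it is not a drop-in replacement for #2; the global constraint all_disjoint expresses #2 directly.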
In the model there is also the following constraint which states that all elements must be used:
constraint (clusters[1] union clusters[2] union clusters[3] union clusters[4]) = dsmElements;
Apart from efficiency, all the constraints above can be replaced with one single constraint, partition_set, which ensures that the sets in clusters are pairwise disjoint and together cover all elements of dsmElements.
constraint partition_set(clusters,dsmElements);
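It might be faster to also combine this with the all_different constraint, but that has to be tested, and note that all_different would additionally rule out solutions with more than one empty cluster.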
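To make the meaning of a set partition concrete, here is a small plain-Java check of the two properties involved: the clusters are pairwise disjoint, and their union is the whole element set. This is only an illustration of the definition, not how MiniZinc or any solver implements partition_set, and the class and method names are hypothetical.

```java
import java.util.EnumSet;
import java.util.List;

// Plain-Java illustration of a set partition; not MiniZinc's implementation.
final class PartitionCheck {
    enum Element { A, B, C, D, E, F, G }

    // True if the clusters are pairwise disjoint and their union equals the universe.
    static boolean isPartition(List<EnumSet<Element>> clusters, EnumSet<Element> universe) {
        EnumSet<Element> seen = EnumSet.noneOf(Element.class);
        for (EnumSet<Element> cluster : clusters) {
            for (Element e : cluster) {
                if (!seen.add(e)) {
                    return false; // e already appeared in an earlier cluster
                }
            }
        }
        return seen.equals(universe);
    }

    public static void main(String[] args) {
        // The last solution printed in the question's output.
        List<EnumSet<Element>> clusters = List.of(
            EnumSet.of(Element.B, Element.C, Element.D, Element.G),
            EnumSet.of(Element.A, Element.E, Element.F),
            EnumSet.noneOf(Element.class),
            EnumSet.noneOf(Element.class));
        System.out.println(isPartition(clusters, EnumSet.allOf(Element.class)));
    }
}
```

Empty clusters are fine under this definition, which matches the solutions shown in the question, where unused clusters are simply left empty.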
This seems like such a simple problem, but I can't find a simple way to represent this in MiniZinc.
include "globals.mzn";
int: target;
int: max_length;
var 1..max_length: length;
array[1..length] of int: t;
constraint sum(t) = target;
constraint alldifferent(t);
solve minimize length;
This program errors with:
MiniZinc: type error: type-inst must be par set but is ``var set of int'
Is there a clean/simple way to represent this problem in MiniZinc?
Arrays in MiniZinc have a fixed size. The compiler is therefore saying that array[1..length] of int: t is not allowed, because length is a variable.
The alternative that MiniZinc offers is arrays with optional types: values that might or might not exist. This means that when you write something like [t | t in 1..length], it will actually give you an array with index set 1..max_length, in which some elements can be marked as absent (<>).
For this particular problem you are also overlooking the fact that t should itself be an array of variables, since the values of t are not yet known at compile time. A better way to formulate this problem would thus be to allow the values of t to be 0 when they are beyond the chosen length:
include "globals.mzn";
int: target;
int: max_length;
var 1..max_length: length;
array[1..max_length] of var int: t;
constraint sum(t) = target;
constraint alldifferent_except_0(t);
constraint forall(i in length+1..max_length) (t[i] = 0);
solve minimize length;
The next step to improve the model would be to give t a sensible initial domain. Also, instead of requiring the values to be all different, you could force a strictly increasing order on the values that are in use; this is equivalent but eliminates some symmetry among the possible solutions.
I'm still having troubles with AnyLogic...I'm developing an epidemic SIRS model and I want to define my own network.
In particular, I have this matrix that defines the daily average number of contacts between age class
and therefore I want every agent to establish contact with other agents according to this matrix...it is driving me crazy :S
AgeClass is a parameter calculated with the following function
I thought to setup an event that occurs once at the beginning with the following code
Now I am saying "connect n times to a random agent"...what I want to say is "connect n times to a random agent with AgeClass k" is there a way to do so?
thanks for the support!
ps when I write int i = AgeClass i takes the value of the parameter AgeClass of the agent that is running the code, right? So i it will be different for different agents?
You have probably found a solution already, but here is a way to do it:
Regarding age, you don't need that big if/else if sequence. Just do something like this:
int ageClass = 0; // a variable of agents
ageClass = (int) floor(age / 5.0);
if (age >= 70.0) ageClass = 14; // just to be sure that max class is 14
return ageClass;
Regarding the network. I would create a function named setup, so that you can put it in agent actions, on startup, e.g. setup();
You can create a link to agents object at the agent level (Person in my code, I use a connection object named contacts). The function would be something like:
// loop through age groups
for (int i = 0; i < network[0].length; i++) {
    // create pool of potential alters by age
    ArrayList<Person> ageGroupPeople = new ArrayList<Person>();
    for (Person p : population) {
        if (p.ageClass == i) { ageGroupPeople.add(p); }
    }
    // create network per agent
    for (Person ego : population) {
        // draw the number of contacts once, not again on every loop test
        int nContacts = poisson(network[ego.ageClass][i]);
        for (int k = 0; k < nContacts; k++) {
            Person alter = randomFrom(ageGroupPeople);
            if (ego != alter) { ego.contacts.connectTo(alter); }
        }
    }
}
I haven't checked the code or how slow it might be; it is just one way to do it.
In AnyLogic, you can represent a matrix as a two-dimensional Java array:
http://help.anylogic.com/topic/com.xj.anylogic.help/html/code/Arrays.html
After initializing the matrix, you may define custom contact network using the element 'Link to agents':
http://help.anylogic.com/topic/com.xj.anylogic.help/html/agentbased/Link.html
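As a minimal sketch of the two-dimensional array idea in plain Java: the 3x3 matrix and its values below are invented placeholders, not the contact data from the question, and the class and method names are hypothetical. Inside AnyLogic the same array could be looked up with an agent's AgeClass as the row index.

```java
// Contact matrix as a 2D array; all values are placeholders for illustration only.
final class ContactMatrix {
    // CONTACTS[i][j] = average daily contacts of someone in age class i with age class j.
    static final double[][] CONTACTS = {
        {3.0, 1.5, 0.5},
        {1.5, 4.0, 1.0},
        {0.5, 1.0, 2.0},
    };

    // Average daily contacts between two age classes.
    static double meanContacts(int fromClass, int toClass) {
        return CONTACTS[fromClass][toClass];
    }

    // Total expected daily contacts for one age class (row sum).
    static double totalContacts(int fromClass) {
        double sum = 0.0;
        for (double c : CONTACTS[fromClass]) {
            sum += c;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(meanContacts(0, 1));
        System.out.println(totalContacts(1));
    }
}
```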
I have a struct, that's a <1x1 struct>, and I'm trying to edit a field in the struct based on the values. The field is called GeoDist_Actual and the struct is called GeoDist_str. The field GeoDist_Actual is a <262792x1 double>, and this is the code I was trying to use in order to get rid of the values that are greater than 1.609344e+05.
i =1;
for i=i:size(GeoDist_str.GeoDist_Actual)
if GeoDist_str.GeoDist_Actual(i,1 > 1.609344e+05
GeoDist_str.GeoDist_Acutal(i,1) = [];
end
end
How would I append or alter this code in order to make it function like I'm aiming? I considered setting all the values to 0, but I'm going to have to go backwards from this in order to get back GPS values, doing a reverse-Vincenty(spherical) calculation, and I'd like to just completely get rid of the values that don't comply with the if condition.
If I can narrow down the question at all, let me know, and thank you for your help in advance!
Edit: I've noticed that when I changed out the section
GeoDist_str.GeoDist_Actual(i,1) = [];
for
GeoDist_str.GeoDist_Actual(i,1) = 0;
It didn't actually solve anything, instead it didn't access the field "GeoDist_Actual" within the struct "GeoDist_str", it just created a mirror field with values of 0.
Consider this example:
% a 10-by-1 vector
x = [1;2;3;4;5;6;7;8;9;10];
% remove entries where the value is less than five
x(x<5) = [];
This is called logical indexing, no need for loops.
Consider the following simple example:
A.a = 1:5;
A =
a: [1 2 3 4 5]
Now delete all elements bigger than 3:
A.a = A.a( ~(A.a > 3) );
A =
a: [1 2 3]
or alternatively:
A.a( A.a > 3 ) = []
For your case it's a little more bulky:
GeoDist_str.GeoDist_Actual = ...
GeoDist_str.GeoDist_Actual( ...
~(GeoDist_str.GeoDist_Actual > 1.609344e+05) )
Consider the following 2 scenarios:
boolean b = false;
int i = 0;
while(i++ < 5) {
b = true;
}
OR
boolean b = false;
int i = 0;
while(i++ < 5) {
if(!b) {
b = true;
}
}
Which is more "costly" to do? If the answer depends on used language/compiler, please provide. My main programming language is Java.
Please do not ask why I would want to do either. They're just bare-bones examples that point out the relevant question: should a variable be set to the same value in a loop over and over again, or should it be tested on every iteration to see whether it still needs to change?
Please do not forget the rules of Optimization Club.
The first rule of Optimization Club is, you do not Optimize.
The second rule of Optimization Club is, you do not Optimize without measuring.
If your app is running faster than the underlying transport protocol, the optimization is over.
One factor at a time.
No marketroids, no marketroid schedules.
Testing will go on as long as it has to.
If this is your first night at Optimization Club, you have to write a test case.
It seems that you have broken rule 2: you have no measurement. If you really want to know, you'll answer the question yourself by setting up a test that runs scenario A against scenario B and finds the answer. There are so many differences between environments that we can't answer for you.
Have you tested this? Working on a Linux system, I put your first example in a file called LoopTestNoIf.java and your second in a file called LoopTestWithIf.java, wrapped a main function and class around each of them, compiled, and then ran with this bash script:
#!/bin/bash
function run_test {
iter=0
while [ $iter -lt 100 ]
do
java $1
let iter=iter+1
done
}
time run_test LoopTestNoIf
time run_test LoopTestWithIf
The results were:
real 0m10.358s
user 0m4.349s
sys 0m1.159s
real 0m10.339s
user 0m4.299s
sys 0m1.178s
This shows that having the if makes it slightly faster on my system (about 0.02 s over 100 runs).
Are you trying to find out if doing the assignment each loop is faster in total run time than doing a check each loop and only assigning once on satisfaction of the test condition?
In the above example I would guess that the first is faster. You perform 5 assignments. In the latter you perform 5 tests and then one assignment.
But you'll need to up the iteration count and throw in some stopwatch timers to know for sure.
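A rough sketch of what such a stopwatch test could look like, using System.nanoTime. The class name, method names and iteration count are arbitrary choices made here. The JIT compiler may well optimize both loops away, so treat the printed numbers with suspicion; a dedicated harness such as JMH is the usual tool when the result really matters.

```java
// Rough timing sketch; the numbers it prints are only indicative.
final class LoopTiming {
    // Scenario 1: assign on every iteration.
    static boolean assignEveryIteration(long n) {
        boolean b = false;
        for (long i = 0; i < n; i++) {
            b = true;
        }
        return b;
    }

    // Scenario 2: test on every iteration, assign only while b is still false.
    static boolean assignOnce(long n) {
        boolean b = false;
        for (long i = 0; i < n; i++) {
            if (!b) {
                b = true;
            }
        }
        return b;
    }

    public static void main(String[] args) {
        long n = 100_000_000L;
        for (int round = 0; round < 5; round++) { // repeat so the JIT can warm up
            long t0 = System.nanoTime();
            boolean r1 = assignEveryIteration(n);
            long t1 = System.nanoTime();
            boolean r2 = assignOnce(n);
            long t2 = System.nanoTime();
            System.out.println("round " + round
                + ": assign " + (t1 - t0) + " ns (" + r1 + ")"
                + ", if " + (t2 - t1) + " ns (" + r2 + ")");
        }
    }
}
```

Returning and printing the boolean gives the compiler at least some reason to keep the loops, and running several rounds in one JVM avoids measuring mostly start-up time.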
Actually, this is the question I was interested in… (I hoped to find the answer somewhere and avoid testing it myself. Well, I didn’t…)
To be sure that your (my) test is valid, you (I) have to do enough iterations to get enough data. Each run must be “long” enough (in terms of time) to show the true difference. I found that even one billion iterations did not take long enough to measure reliably… So I wrote this test:
for (int k = 0; k < 1000; ++k)
{
    {
        long stopwatch = System.nanoTime();
        boolean b = false;
        int i = 0, j = 0;
        // note: j is never reset, so the inner loop body only runs during the first pass of the outer loop
        while (i++ < 1000000)
            while (j++ < 1000000)
            {
                int a = i * j; // to slow down a bit
                b = true;
                a /= 2; // to slow down a bit more
            }
        long time = System.nanoTime() - stopwatch;
        System.out.println("\tasgn\t" + time);
    }
    {
        long stopwatch = System.nanoTime();
        boolean b = false;
        int i = 0, j = 0;
        // same loop structure as above, including the j that is never reset
        while (i++ < 1000000)
            while (j++ < 1000000)
            {
                int a = i * j; // the same thing as above
                if (!b)
                {
                    b = true;
                }
                a /= 2;
            }
        long time = System.nanoTime() - stopwatch;
        System.out.println("\tif\t" + time);
    }
}
I ran the test three times, storing the data in Excel, then I swapped the first (‘asgn’) and second (‘if’) cases and ran it three times again… And the result? The ‘if’ case “won” four times and the ‘asgn’ case twice. This shows how sensitive the execution can be. But in general, I hope this also shows that the ‘if’ case is the better choice.
Thanks, anyway…
Any compiler (except, perhaps, in debug) will optimize both these statements to
bool b = true;
But generally, the relative speed of an assignment and a branch depends on the processor architecture, not on the compiler. A modern superscalar processor can perform horribly on branches, especially mispredicted ones. A simple microcontroller uses roughly the same number of cycles for any instruction.
Relative to your barebones example (and perhaps your real application):
boolean b = false;
// .. other stuff, might change b
int i = 0;
// .. other stuff, might change i
b |= i < 5;
while(i++ < 5) {
// .. stuff with i, possibly stuff with b, but no assignment to b
}
problem solved?
But really - it's going to be a question of the cost of your test (generally more than just if (boolean)) and the cost of your assignment (generally more than just primitive = x). If the test/assignment is expensive or your loop is long enough or you have high enough performance demands, you might want to break it into two parts - but all of those criteria require that you test how things perform. Of course, if your requirements are more demanding (say, b can flip back and forth), you might require a more complex solution.