I am trying to use MiniZinc to build a matrix of true/false values, where each column satisfies certain conditions (at least 3 true values, and an odd number of true values). So far, so good, but when I add constraints to make each column different from all other columns, I don't get consistent results:
With the commented out version below, with minizinc 2.1.7 on Ubuntu-on-WSL (Windows), it finds a solution pretty much immediately. However, any version of minizinc (2.1.7, 2.2.0, 2.3.1) on either a Mac, or on an ArchLinux install, can't satisfy the constraints (at least not within a few minutes). All of this is with Gecode.
Chuffed works fine to satisfy the constraints, again with the commented-out version of the code rather than the one below.
However, if I solve to minimize certain values (see second snippet), then Chuffed doesn't find any solution anymore (again, at least not within a few minutes). I would have expected it to at least find the same solution as it finds for "satisfy".
What am I doing wrong? Are there better ways of writing the "distinct columns" constraint that work more consistently? I wouldn't think this is a particularly hard problem for a constraint solver, so I suspect it's really an issue with how I've written the constraints.
I would like to keep the H matrix as a 2d matrix of true/false, as there are other constraints that benefit from having it in that shape.
int: k = 7;
int: n = 47;
array[1..k,1..n] of var bool: H;
array[1..k] of var bool : flip_bits;
predicate all_different_int(array[int] of var int: x) =
forall(i,j in index_set(x) where i < j) ( x[i] != x[j] );
constraint forall(j in 1..n)(
sum(i in 1..k)(
if H[i,j] then 1 else 0 endif
) mod 2 > 0
);
constraint forall(j in 1..n)(
sum(i in 1..k)(
if H[i,j] then 1 else 0 endif
) >= 3
);
%array[1..n] of var int: H_t;
%constraint forall(j in 1..n)(
% H_t[j] = sum(i in 1..k)(
% if H[i,j] then pow(2,i) else 0 endif
% )
%);
%constraint all_different_int(H_t);
constraint all_different_int(
[sum(i in 1..k)(if H[i,j] then pow(2,i) else 0 endif) | j in 1..n]
);
solve satisfy;
var int: z2 = sum(i in 1..k, j in 1..n)(if H[i,j] then 1 else 0 endif);
var int: z1 = max(i in 1..k)(sum (j in 1..n)(if H[i,j] then 1 else 0 endif));
solve minimize z1*1000+z2;
Here is a rewritten version of the model. Chuffed solves it in 43s; an optimal solution is shown below.
H_t is kept since it seems to "drive" the solution better using the all_different constraint. The lex2 constraint is a symmetry-breaking constraint which ensures that the rows and columns of the matrix H are lexicographically ordered.
Some other constraints were rewritten as well, e.g. sum ... (if H[i,j] then 1 else 0 endif) was replaced by sum ... (H[i,j]) wherever the plain sum was sufficient.
include "globals.mzn";
int: k = 7;
int: n = 47;
array[1..k,1..n] of var bool: H;
array[1..k] of var bool : flip_bits;
constraint forall(j in 1..n)(
sum(i in 1..k)(H[i,j]) mod 2 = 1);
constraint forall(j in 1..n)(
sum(i in 1..k)(H[i,j]) >= 3);
array[1..n] of var int: H_t;
constraint
forall(j in 1..n)(
H_t[j] = sum(i in 1..k)(
if H[i,j] then pow(2,i) else 0 endif
)
);
constraint all_different_int(H_t);
var int: z2 = sum(i in 1..k, j in 1..n)(H[i,j]);
var int: z1 = max(i in 1..k)(sum (j in 1..n)(H[i,j]));
constraint
z1 > 0 /\ z2 > 0 /\ z > 0
;
% symmetry breaking
constraint lex2(H);
var int: z = z1*1000+z2;
% solve :: int_search(H_t, first_fail, indomain_split, complete) minimize z;
% solve satisfy;
solve minimize z;
output [
"z: \(z)\n",
% "H: \(H)\n",
"H_t: \(H_t)\n"
]
++
[
if j = 1 then "\n" else "" endif++
"\(H[i,j]*1)"
| i in 1..k, j in 1..n
];
The optimal solution is found after 43s using the following command
$ time minizinc test96.mzn -a -s --solver chuffed
z: 24165
H_t: [224, 208, 176, 112, 200, 168, 104, 152, 88, 56, 248, 196, 164, 100, 148, 84, 52, 140, 76, 44, 236, 28, 124, 194, 162, 98, 146, 82, 50, 242, 138, 74, 42, 234, 26, 218, 134, 70, 38, 230, 22, 182, 118, 14, 158, 94, 62]
00000000000000000000000111111111111111111111111
00000000000111111111111000000000000011111111111
00001111111000000111111000000011111100000001111
01110001111000111000011000111100001100001110111
10110110011011001001101011001100110000110110001
11011010101101010010101101010101010101010010010
11101101001110100100100110100110010110010100100
----------
==========
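The reported solution can be sanity-checked outside MiniZinc. The following Python sketch is not part of the original model; the names `H_T`, `column_bits` and the check variables are chosen here for illustration. It decodes each H_t value back into its column (row i contributes 2^i, as in the model), then checks the column constraints and recomputes z.

```python
# Values of H_t copied from the solver output above.
H_T = [224, 208, 176, 112, 200, 168, 104, 152, 88, 56, 248, 196, 164, 100,
       148, 84, 52, 140, 76, 44, 236, 28, 124, 194, 162, 98, 146, 82, 50,
       242, 138, 74, 42, 234, 26, 218, 134, 70, 38, 230, 22, 182, 118, 14,
       158, 94, 62]
K = 7  # number of rows in H


def column_bits(value, k=K):
    """Decode one H_t value into the k booleans of its column, row 1 first."""
    return [bool((value >> i) & 1) for i in range(1, k + 1)]


columns = [column_bits(v) for v in H_T]

# Every column must be different from every other column.
all_distinct = len(set(H_T)) == len(H_T)

# Every column needs an odd number of true values, and at least 3 of them.
odd_and_at_least_3 = all(sum(c) % 2 == 1 and sum(c) >= 3 for c in columns)

# Objective: z1 is the largest row sum, z2 the total number of true values.
z1 = max(sum(c[i] for c in columns) for i in range(K))
z2 = sum(sum(c) for c in columns)
z = z1 * 1000 + z2

print(all_distinct, odd_and_at_least_3, z)
```

A check like this only confirms that the printed solution is feasible and that z matches; it says nothing about optimality, which is the solver's job.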
Some other comments:
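Because lex2 already orders the columns lexicographically, the constraint that all columns should be different can also be stated by requiring each column to differ from its predecessor: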
constraint
forall(j in 2..n) (
sum(i in 1..k) ( H[i,j] != H[i,j-1]) > 0
)
;
but, it's not as efficient (in this model) as the all_different variant.
When a solver seems to be slow, a tip is to test different search strategies. Here's one that seemed to be a bit faster when I started to play with the model:
solve :: int_search(H_t, first_fail, indomain_split, complete) minimize z;
But when adding the lex2 constraint it was slower than a plain solve minimize z.
It is often a good thing to play with the search strategies...
I have recently started studying minizinc, but I have got this strange behaviour in my program.
.dzn
n = 5;
c = 2;
.mzn
include "globals.mzn";
int: n;
int: c;
set of int: num_deliveries = 1..n-1;
int: headquarter = 1;
set of int: num_places = 1..n;
set of int: deliveries = 2..n;
set of int: couriers = 1..c;
set of int: num_max_deliveries = 1..n+2;
set of int: schedule_domain = 0..n;
int: first_place_idx = 1;
int: last_place_idx = n+2;
array[couriers,num_max_deliveries] of var schedule_domain: schedule;
array[1..2*n] of int: total = [schedule[i,j]|i,j in num_max_deliveries where i<=2 /\ j != first_place_idx /\ j!= last_place_idx];
output ["len_without_variable = \(length([ k | k in total where k != 0]))"];
var int: len_cleaned = length([ k | k in total where k != 0]);
output ["len_with_variable = \(len_cleaned)\n"];
In particular, these lines of code give different results, even though the two expressions are the same.
output ["len_without_variable = \(length([ k | k in total where k != 0]))"];
var int: len_cleaned = length([ k | k in total where k != 0]);
output ["len_with_variable = \(len_cleaned)\n"];
Why does it happen?
Here is one output of the model (with total added):
len_without_variable = 2
len_with_variable = 10
total: [3, 3, 0, 0, 0, 0, 0, 0, 0, 0]
To be honest, I'm not sure if the two calculations of the number of non zero elements in total should be the same or not. Perhaps this is a bug in how length operates on decision variables (with a where condition), and it should be 2 (in this example). Until this is settled, you should probably avoid using length like this.
The output of len_without_variable, the one defined in the output section, operates on the actual and known values of the solution. So this might not be the same thing as using length in a constraint/variable definition.
To calculate the number of non zero values you can use sum instead:
var 0..2*n: num_non_zeros = sum([ 1 | k in total where k != 0]);
However, the construct ... where k != 0 creates temporary variables, which makes the model larger than necessary, so it's better to use the following:
var 0..2*n: num_non_zeros = sum([ total[i] != 0 | i in index_set(total)]);
let's say I have:
n = 14
n is the result of the following sums of integers:
[5, 2, 7] -> 5 + 2 + 7 = 14 = n
[3, 4, 5, 2] -> 3 + 4 + 5 + 2 = 14 = n
[1, 13] -> 1 + 13 = 14 = n
[13, 1] -> 13 + 1 = 14 = n
[4, 3, 5, 2] -> 4 + 3 + 5 + 2 = 14 = n
...
I would need a hash function h so that:
h([5, 2, 7]) = h([3, 4, 5, 2]) = h([1, 13]) = h([13, 1]) = h([4, 3, 5, 2]) = h(...)
I.e. the order of the integer terms doesn't matter, and as long as their integer sum is the same, their hash should also be the same.
I need to do this without computing the sum n, because the terms as well as n can be very high and easily overflow (they don't fit the bits of an int), that's why I am asking this question.
Are you aware or maybe do you have an insight on how I can implement such a hash function?
Given a list/sequence of integers, this hash function must return the same hash if the sum of the integers would be the same, but without computing the sum.
Thank you for your attention.
EDIT: I elaborated on @derpirscher's answer and modified his function a bit further as I had collisions on multiples of BIG_PRIME (this example is in JavaScript):
function hash(seq) {
const BIG_PRIME = 999999999989;
const MAX_SAFE_INTEGER_DIV_2_FLOOR = Math.floor(Number.MAX_SAFE_INTEGER / 2);
let h = 0;
for (let i = 0; i < seq.length; i++) {
let value = seq[i];
if (h > MAX_SAFE_INTEGER_DIV_2_FLOOR) {
h = h % BIG_PRIME;
}
if (value > MAX_SAFE_INTEGER_DIV_2_FLOOR) {
value = value % BIG_PRIME;
}
h += value;
}
return h;
}
My question now would be: what do you think about this function? Are there some edge cases I didn't take into account?
Thank you.
EDIT 2:
Using the above function, hash([1,2]) and hash([4504 * BIG_PRIME + 1, 4504 * BIG_PRIME + 2]) will collide, as mentioned by @derpirscher.
Here is another modified version of the above function, which applies the modulo % BIG_PRIME to only one of the two terms if either of the two is greater than MAX_SAFE_INTEGER_DIV_2_FLOOR:
function hash(seq) {
const BIG_PRIME = 999999999989;
const MAX_SAFE_INTEGER_DIV_2_FLOOR = Math.floor(Number.MAX_SAFE_INTEGER / 2);
let h = 0;
for (let i = 0; i < seq.length; i++) {
let value = seq[i];
if (
h > MAX_SAFE_INTEGER_DIV_2_FLOOR &&
value > MAX_SAFE_INTEGER_DIV_2_FLOOR
) {
if (h > MAX_SAFE_INTEGER_DIV_2_FLOOR) {
h = h % BIG_PRIME;
} else if (value > MAX_SAFE_INTEGER_DIV_2_FLOOR) {
value = value % BIG_PRIME;
}
}
h += value;
}
return h;
}
I think this version lowers the number of collisions a bit further.
What do you think? Thank you.
EDIT 3:
Even though I tried to elaborate on @derpirscher's answer, his implementation of hash is the correct one and the one to use.
Use his version if you need such a hash function.
You could calculate the sum modulo some big prime. If you want to stay within the range of int, you need to know what the maximum integer is in the language you are using. Then select a BIG_PRIME that's just below maxint / 2.
Assuming an int to be 4 bytes, maxint = 2147483647, thus the biggest prime < maxint/2 would be 1073741789:
int hash(int[] seq) {
const int BIG_PRIME = 1073741789; // largest prime below 2^30
int h = 0;
for (int i = 0; i < seq.Length; i++) {
h = (h + seq[i] % BIG_PRIME) % BIG_PRIME;
}
return h;
}
As at every step both summands will always be below maxint/2 you won't get any overflows.
Edit
From a mathematical point of view, the following property which may be important for your use case holds:
(a + b + c + ...) % N == (a % N + b % N + c % N + ...) % N
But yeah, of course, as in every hash function you will have collisions. You can't have a hash function without collisions, because the size of the domain of the hash function (i.e. the number of possible input values) is generally much bigger than the size of the codomain (i.e. the number of possible output values).
For your example the size of the domain is (in principle) infinite, as you can have any count of numbers from 1 to 2000000000 in your sequence. But your codomain is just ~2000000000 elements (ie the range of int)
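To make the idea concrete, here is a minimal Python sketch of the same modular sum. The function name `order_free_hash` is made up for this illustration. Python integers are arbitrary precision, so the overflow concern does not arise here; the sketch only shows that the running value stays below the modulus and that the result depends on the sum alone, not on the order or grouping of the terms.

```python
BIG_PRIME = 1073741789  # prime just below 2**30, as in the answer above


def order_free_hash(seq, modulus=BIG_PRIME):
    """Hash a sequence of non-negative integers by their sum modulo `modulus`.

    Both summands are reduced before each addition, so in a fixed-width
    language the intermediate value never exceeds 2 * (modulus - 1).
    """
    h = 0
    for value in seq:
        h = (h + value % modulus) % modulus
    return h


# Sequences from the question: all of them sum to 14.
same_sum = [[5, 2, 7], [3, 4, 5, 2], [1, 13], [13, 1], [4, 3, 5, 2]]
hashes = [order_free_hash(s) for s in same_sum]
print(hashes)

# Collisions are unavoidable: sums that differ by a multiple of the
# modulus hash to the same value.
print(order_free_hash([1, 2]) == order_free_hash([BIG_PRIME + 1, 2]))
```

The second print shows the kind of collision the answer warns about: equal sums always give equal hashes, but equal hashes do not imply equal sums.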
Given a recursive loop similar to the following:
A = [5,2;0,2]
B = [5;6]
x = [0;7]
for i = 1:10
x(:,i+1) = A * x(:,i) + B
end
How can this be represented without a loop?
Sample output:
[ 0, 19, 140, 797, 4186, 21339, 107520, 539257, 2699606, 13504679, 67536700;
7, 20, 46, 98, 202, 410, 826, 1658, 3322, 6650, 13306]
Here is a more mathy approach by solving the general formula for your recursion
u = pinv(A-eye(2))*B;
C = arrayfun(@(n) A^n*(x+u)-u,0:10,'UniformOutput',false);
M = cat(2,C{:});
which gives
M =
Columns 1 through 9:
0 19 140 797 4186 21339 107520 539257 2699606
7 20 46 98 202 410 826 1658 3322
Columns 10 and 11:
13504679 67536700
6650 13306
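The closed form works because, if u solves (A - I) u = B, then x(n+1) + u = A (x(n) + u), so x(n) = A^n (x(0) + u) - u. The Python sketch below checks this against the plain loop using exact fractions. It is only an illustration: the helper names are invented here, it is hard-coded for 2x2 matrices, and it assumes A - I is invertible (the MATLAB version uses pinv instead of a plain inverse).

```python
from fractions import Fraction


def mat_vec(m, v):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def mat_mul(a, b):
    """Multiply two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse_2x2(m):
    """Invert a 2x2 matrix (assumes a non-zero determinant)."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


A = [[Fraction(5), Fraction(2)], [Fraction(0), Fraction(2)]]
B = [Fraction(5), Fraction(6)]
x0 = [Fraction(0), Fraction(7)]

# u solves (A - I) u = B.
A_minus_I = [[A[0][0] - 1, A[0][1]], [A[1][0], A[1][1] - 1]]
u = mat_vec(inverse_2x2(A_minus_I), B)


def closed_form(n):
    """Return x(n) = A^n (x0 + u) - u without iterating the recursion on x."""
    power = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
    for _ in range(n):
        power = mat_mul(power, A)
    shifted = mat_vec(power, [x0[0] + u[0], x0[1] + u[1]])
    return [shifted[0] - u[0], shifted[1] - u[1]]


def by_loop(steps):
    """Return the columns x(0..steps) from the original recursion."""
    cols = [x0]
    for _ in range(steps):
        ax = mat_vec(A, cols[-1])
        cols.append([ax[0] + B[0], ax[1] + B[1]])
    return cols


loop_columns = by_loop(10)
closed_columns = [closed_form(n) for n in range(11)]
print(loop_columns == closed_columns)
print([int(v) for v in closed_columns[10]])
```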
I think you are looking to create a recursive function. If so, the below might work for you.
A = [5,2;0,2]
B = [5;6]
x = [0;7]
x = myRecursive(A,B,x, 10)
function [x] = myRecursive(A,B,x,n)
x(:,end+1) = A * x(:,end) + B;
if size(x,2) <= n
x = myRecursive(A,B,x,n);
end
end
I created the following three dimensional mockup matrix:
mockup(:,:,1) = ...
[100, 100, 100; ...
103, 95, 100; ...
101, 85, 100; ...
96, 90, 102; ...
91, 89, 99; ...
97, 91, 97; ...
105, 83, 100];
mockup(:,:,2) = ...
[50, NaN, NaN; ...
47, NaN, 40; ...
45, 60, 45; ...
47, 65, 45; ...
51, 70, 45; ...
54, 65, 50; ...
62, 80, 55];
I also defined percentTickerAvailable = 0.5.
The columns represent equity prices from three different assets. For further processing I need to manipulate the NaN values in the following way.
If the percentage of NaNs in any given ROW is greater than 1 - percentTickerAvailable, replace all values in these particular rows with NaNs. That is, if not enough assets have prices in that particular row, ignore the row completely.
If the percentage of NaNs in any given ROW is less than or equal to 1 - percentTickerAvailable, replace the respective NaNs with -inf.
To be clear, "percentage of NaNs in any given ROW" is calculated as follows:
Number of NaNs in any given ROW divided by number of columns.
The adjusted mockup matrix should look like this:
mockupAdj(:,:,1) = ...
[100, 100, 100; ...
103, 95, 100; ...
101, 85, 100; ...
96, 90, 102; ...
91, 89, 99; ...
97, 91, 97; ...
105, 83, 100];
mockupAdj(:,:,2) = ...
[NaN, NaN, NaN; ...
47, -inf, 40; ...
45, 60, 45; ...
47, 65, 45; ...
51, 70, 45; ...
54, 65, 50; ...
62, 80, 55];
So far, I did the following:
function vout = ranking(vin, percentTickerAvailable)
percentNonNaN = 1 - sum(isnan(vin), 2) / size(vin, 2);
NaNIdx = percentNonNaN < percentTickerAvailable;
infIdx = percentNonNaN > percentTickerAvailable & ...
percentNonNaN < 1;
[~, ~, numDimVin] = size(vin);
for i = 1 : numDimVin
vin(NaNIdx(:,:,i) == 1, :, i) = NaN;
end
about = vin;
end % EoF
Calling mockupAdj = ranking(mockup, 0.5) already transforms the first row, mockup(1,:,2), correctly to {'NaN', 'NaN', 'NaN'}. However, I am struggling with the second point. With infIdx I have already identified the rows that correspond to the second condition, but I don't know how to use that information to replace the single NaN in mockup(2,2,2) with -inf.
Any hint is highly appreciated.
This is a good example of something that can be solved using vectorization. I am providing two versions of the code, one that uses the modern syntax (including implicit expansion) and one for older versions of MATLAB.
Several things to note:
In the NaN substitution stage, I'm using a "trick" where 0/0 is evaluated to NaN.
In the Inf substitution stage, I'm using logical masking/indexing to access the correct elements in vin.
R2016b and newer:
function vin = ranking (vin, percentTickerAvailable)
% Find percentage of NaNs on each line:
pNaN = mean(isnan(vin), 2, 'double');
% Fills rows with NaNs:
vin = vin + 0 ./ (1 - ( pNaN > 1 - percentTickerAvailable));
% Replace the rest with -Inf
vin(isnan(vin) & pNaN <= 1 - percentTickerAvailable) = -Inf;
end
Prior to R2016b:
function vin = rankingOld (vin, percentTickerAvailable)
% Find percentage of NaNs on each line:
pNaN = mean(isnan(vin), 2, 'double');
% Fills rows with NaNs:
vin = bsxfun(@plus, vin, 0 ./ (1 - ( pNaN > 1 - percentTickerAvailable)));
% Replace the rest with -Inf
vin(bsxfun(@and, isnan(vin), pNaN <= 1 - percentTickerAvailable)) = -Inf;
end
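For readers who want to see the row rule spelled out without vectorization, here is a small Python sketch of the same logic applied to one 2D slice. It is an illustration only: `adjust_rows` is a name invented here, and it follows the rule as stated in the question (a NaN fraction above 1 - percentTickerAvailable blanks the whole row, otherwise the NaNs become -inf).

```python
import math


def adjust_rows(rows, percent_ticker_available):
    """Apply the NaN rule from the question to a list of rows."""
    max_nan_fraction = 1 - percent_ticker_available
    adjusted = []
    for row in rows:
        nan_fraction = sum(math.isnan(v) for v in row) / len(row)
        if nan_fraction > max_nan_fraction:
            # Too few prices available: ignore the whole row.
            adjusted.append([math.nan] * len(row))
        else:
            # Enough prices available: mark the gaps with -inf.
            adjusted.append([-math.inf if math.isnan(v) else v for v in row])
    return adjusted


nan = math.nan
# First three rows of mockup(:,:,2) from the question.
slice_2 = [[50, nan, nan],
           [47, nan, 40],
           [45, 60, 45]]
result = adjust_rows(slice_2, 0.5)
print(result)
```

The first row has 2 of 3 values missing (above the 0.5 limit) and is blanked; the second has 1 of 3 missing and only that entry becomes -inf.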
1) "The percentage of NaN in any given row should be smaller than 1" ... Are you talking about a ratio? In that case this is a useless check, as it will always be true. Or are you talking about percentages? In that case your code doesn't do what you describe. My guess is ratio.
2) Based on my guess, I have a follow up question: following your description, shouldn't mockup(2,2,2) stay NaN? There is 33% (<50%) of NaN in that row, so it does not fulfill your condition 2.
3) Based on the answers I deemed logical, I would have changed percentNaN = sum(isnan(vin), 2) / size(vin, 2); for readability, and NaNIdx = percentNaN > percentTickerAvailable; accordingly. Now just add one line in front of your loop:
vin(isnan(vin)) = -inf;
Why? Because like this you replace all the NaNs by -inf. Later on, the ones that respect condition 1 will be overwritten to NaN again, by the loop. You don't need the InfIdx.
4) Be aware that your function cannot return vout as of now. Just let it return vin, and you'll be fine.
You can also use logical indexing to achieve this task:
x(:,:,1) = ...
[100, 100, 100; ...
103, 95, 100; ...
101, 85, 100; ...
96, 90, 102; ...
91, 89, 99; ...
97, 91, 97; ...
105, 83, 100];
x(:,:,2) = ...
[50, NaN, NaN; ...
47, NaN, 40; ...
45, 60, 45; ...
47, 65, 45; ...
51, 70, 45; ...
54, 65, 50; ...
62, 80, 55];
% Fix the threshold.
tres = 0.5;
% Flag the NaN entries.
in = isnan(x);
% Which rows have more than tres (50%) NaN?
ind = (sum(in,2)./size(x,2)) > tres;
% Expand the row flags across all columns.
rowMask = repmat(ind, 1, size(x,2));
% NaNs in the remaining rows become -inf.
x(in & ~rowMask) = -inf;
% Rows with too many NaNs become all NaN.
x(rowMask) = NaN;
I want to numerically integrate an integral with infinite limit. Does anyone have any idea how should I do that?
int(x* exp (v*x + (1-exp(v*x))/v),x, 0 , inf) does not work.
Note that I will have values for v.
%n=10;
kappa=.5;
delta0=.5;
Vmax=500;
Vdep=2.2;
l=2.2;
kbT=4.1;
%xb=.4;
fb=10;
k=1;
V0=5;
e1=(fb*l/kbT)*(kappa/delta0);
e2=Vmax/V0;
e3=Vdep/V0;
w=zeros(1,25);
for v=1:25
w(:,v)=integral(@(x) x.*exp(v*x+((1-exp(v*x))/v)),0,inf);
end
e12=e2*exp(-e1*(1:25).*w.^2)-e3;
plot(e12);
ylim([0 25]);
hold on;
plot(0:25,0:25);
xlim([0 25]);
%hold off;
The plot does not match the real data in the article (especially for the e12 curve).
I need to calculate the intersection of the 2 curves (which is ~13.8 based on the paper), and then in the second part I have to add a term to e12 which contains an independent variable:
v=13.8;
w= integral(@(x) x.*exp(v*x+((1-exp(v*x))/v)),0,inf)
e4 = zeros (1,180);
fl = 1:180;
e4(:,fl)= (fl*l/kbT)*(kappa/n);
e12=e2*exp(-e1*v*w^2-e4)-e3
But again the problem is that running this code gives a negative value for e12, which should only approach zero for large values of fl (fl>160).
To show how this code differs from the expected curve, you can plot these data on the same figure:
fl = [0, 1, 4, 9, 15, 20, 25, 40, 60, 80, 100, 120, 140, 160, 180];
e12 = [66, 60, 50, 40, 30, 25.5, 20, 15.5, 10.5, 8.3, 6.6, 5, 2.25, 1.1, 0.5];
which clearly do not match the curve generated by the code.
Assuming the question is about this full code:
syms x;
v = 1; % For example
int(x*exp(v*x + (1-exp(v*x))/v),x, 0, Inf)
and the issue is that it returns itself (i.e., int doesn't find an analytic solution), one can set the 'IgnoreAnalyticConstraints' option to true (more details) to get a solution:
syms x;
v = 1; % For example
int(x*exp(v*x + (1-exp(v*x))/v),x, 0, Inf, 'IgnoreAnalyticConstraints', true)
returns -ei(-1)*exp(1), where ei is the exponential integral function (see also expint for numerical calculations). For negative values of v the solution will also be in terms of eulergamma, the Euler-Mascheroni constant. And of course the integral is undefined if v is 0.
Using Mathematica 10.0.2's Integrate yields a full solution for symbolic v.
Integrate[x Exp[v x - (Exp[v x] - 1)/v], {x, 0, Infinity}]
returns
ConditionalExpression[(E^(1/v) (EulerGamma + Gamma[0, 1/v] + Log[1/v]))/v, Re[v] < 0]
Applying Assumptions:
Integrate[x Exp[v x - (Exp[v x] - 1)/v], {x, 0, Infinity}, Assumptions -> v > 0]
Integrate[x Exp[v x - (Exp[v x] - 1)/v], {x, 0, Infinity}, Assumptions -> v < 0]
returns
(E^(1/v) Gamma[0, 1/v])/v
and
(E^(1/v) (2 EulerGamma - 2 ExpIntegralEi[-(1/v)] + Log[1/v^2]))/(2 v)
where Gamma is the upper incomplete gamma function. These appear to match up with the results from Matlab.
To evaluate these numerically in Matlab:
% For v > 0
v_inv = 1./v;
exp(v_inv).*expint(v_inv).*v_inv
or
% For v < 0
v_inv = 1./v;
exp(v_inv).*(2*double(eulergamma)+2*(expint(v_inv)+pi*1i)+log(v_inv.^2)).*v_inv/2
Numerical integration is performed by summing the function at discrete points with distance dx. The smaller dx you choose, the better approximation you get. For example integrating from x=0 to x=10 is done by:
x = 0:dx:10;
I = sum(x.* exp (v*x + (1-exp(v*x))/v))*dx;
Obviously, you can't do that for x=inf. But I believe your function decays rapidly. Therefore, you can assume that x* exp (v*x + (1-exp(v*x))/v) = 0 for large enough x. Otherwise the integral is divergent. So all you have to do is set the upper limit on x. If you are not sure what the limit should be, you can use a loop with a stopping condition:
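I = 0;
prevI = -1;
% Start one step in: the integrand is 0 at x = 0, so starting there would end the loop at once.
% err must be smaller than the first increment (about dx^2).
x = dx;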
while abs(I-prevI)>err
prevI = I;
I = I + x.* exp (v*x + (1-exp(v*x))/v)*dx;
x = x + dx;
end
Now, all you have to do is set the desired dx and err
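As a rough cross-check of the truncation idea, here is a Python sketch using the trapezoidal rule on a finite interval. The cutoff `10 / v` and the step count are assumptions made for this illustration (they suit v of order 1; for very small or very large v they would need revisiting), and the function names are invented here. For v = 1 the result should agree with the closed form exp(1/v)*expint(1/v)/v from the other answer, which is about 0.5963.

```python
import math


def integrand(x, v):
    """x * exp(v*x + (1 - exp(v*x)) / v), the function from the question."""
    return x * math.exp(v * x + (1.0 - math.exp(v * x)) / v)


def integrate_truncated(v, steps=20000):
    """Trapezoidal rule on [0, 10 / v].

    Assumption: beyond v*x = 10 the double-exponential decay makes the
    integrand negligible, so the infinite tail is simply dropped.
    """
    upper = 10.0 / v
    h = upper / steps
    total = 0.5 * (integrand(0.0, v) + integrand(upper, v))
    for i in range(1, steps):
        total += integrand(i * h, v)
    return total * h


approx = integrate_truncated(1.0)
print(approx)
```

In MATLAB itself, integral with an Inf limit already handles the tail, so a manual cutoff like this is mainly useful for checking results.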
You must read this: MathWorks link.
Perhaps you are making a mistake in the function that you use. Also note that MATLAB syntax is case sensitive.