Matlab optimization - conditional sum of optimization variable in constraint - matlab

I'm trying to solve a ILP problem using binary variables in MATLAB (bintprog). I need to define a constraint similar as below :
I want to sum binary variables x<sub>i, j</sub> until one of them equals to zero and after that stop this summation. How can I define this constraint in ILP?

The way I read this (somewhat difficult as there is little explanation; I think the question could have been phrased more clearly), this is essentially counting the first x(k)'s that are 1. Here I assume x(i,j) is mapped to x(k) so we have a better defined ordering. The counting can be done using a new binary variable y(k):
y(k) = x(k)*y(k-1) if k>1
y(k) = x(k) if k=1
y(k) in {0,1}
For example:
x = [1 1 1 0 1 0]
y = [1 1 1 0 0 0]
The nonlinear expression x(k)*y(k-1) can easily be linearized (it is a multiplication of two binary variables; see here; note that with this formulation we can even relax y(k) to continuous between 0 and 1). An optimized version can look like:
y(k) >= x(k)+y(k-1)-1 if k>1
y(k) = x(k) if k=1
0 <= y(k) <= 1
Now you just can do
sum(k, y(k)) <= c
This formulation would also allow to say: number of leading 1s should be between c1 and c2.
We can also just write immediately:
x(1) + ... + x(c+1) <= c
which also forbids more than c leading 1s and is even simpler.
A different problem we see more often (e.g. in generator scheduling) is the condition that we cannot have more than c consecutive x(k)'s turned on (i.e. the generator can not be on for more than c consecutive periods: after c periods on, the generator must be turned off for at least 1 period). This means we don't allow sequences of more than c 1's.

This is quite a non-typical MIP-constraint in my opinion. Here is my attempt in formulating this:
You have to think about:
the necessity of handling Case B (dependent on your objective and other constraints) and the downfalls of the bigM-approach nice read
how to implement the product of two binary-variables (which is easy; see here)
EDIT
At the cost of the double amount of binary-auxiliary variables, it is also possible to formulate this without bigM-constants.
Sketch:
Instead of using a one-sided implication to set the indicators, use equality == both-implications -> no aux-var is free; all are 0 or 1 depending on x(z) only
Now build a second layer of binary-aux variables (b_i) with the following idea: b_i >= smallConst * b_i-1 + smallConst * a_i (smallConst in (0, 0.5]; should not be close 0 )
Use b-vector for your final sum-constraint (like describes above; but not using a-vector anymore!)

Related

FIR filter length is the intercept included as a coefficient?-- Matlab

I have some confusion about the terminologies and simulation of an FIR system. I shall appreciate help in rectifying my mistakes and informing what is correct.
Assuming a FIR filter with coefficient array A=[1,c2,c3,c4]. The number of elements are L so the length of the filter L but the order is L-1.
Confusion1: Is the intercept 1 considered as a coefficient? Is it always 1?
Confusion2: Is my understanding correct that for the given example the length L= 4 and order=3?
Confusion3: Mathematically, I can write it as:
where u is the input data and l starts from zero. Then to simulate the above equation I have done the following convolution. Is it correct?:
N =100; %number of data
A = [1, 0.1, -0.5, 0.62];
u = rand(1,N);
x(1) = 0.0;
x(2) = 0.0;
x(3) = 0.0;
x(4) = 0.0;
for n = 5:N
x(n) = A(1)*u(n) + A(2)*u(n-1)+ A(3)*u(n-3)+ A(4)*u(n-4);
end
Confusion1: Is the intercept 1 considered as a coefficient? Is it always 1?
Yes it is considered a coefficient, and no it isn't always 1. It is very common to include a global scaling factor in the coefficient array by multiplying all the coefficients (i.e. scaling the input or output of a filter with coefficients [1,c1,c2,c2] by K is equivalent to using a filter with coefficients [K,K*c1,K*c2,K*c3]). Also note that many FIR filter design techniques generate coefficients whose amplitude peaks near the middle of the coefficient array and taper off at the start and end.
Confusion2: Is my understanding correct that for the given example the length L= 4 and order = 3?
Yes, that is correct
Confusion3: [...] Then to simulate the above equation I have done the following convolution. Is it correct? [...]
Almost, but not quite. Here are the few things that you need to fix.
In the main for loop, applying the formula you would increment the index of A and decrement the index of u by 1 for each term, so you would actually get x(n) = A(1)*u(n) + A(2)*u(n-1)+ A(3)*u(n-2)+ A(4)*u(n-3)
You can actually start this loop at n=4
The first few outputs should still be using the formula, but dropping the terms u(n-k) for which n-k would be less than 1. So, for x(3) you'd be dropping 1 term, for x(2) you'd be dropping 2 terms and for x(1) you'd be dropping 3 terms.
The modified code would look like the following:
x(1)=A(1)*u(1);
x(2)=A(1)*u(2) + A(2)*u(1);
x(3)=A(1)*u(3) + A(2)*u(2) + A(3)*u(1);
for n = 4:N
x(n) = A(1)*u(n) + A(2)*u(n-1)+ A(3)*u(n-2)+ A(4)*u(n-3);
end

How to minimize a function with the constraint that its derivative should be always greater than 0

I am trying to optimize a nonlinear function with 1500 variables(instantaneous phase), with the help of fmincon in Matlab. The constraint to the optimum variable is that the difference between consecutive elements in the optimal variable obtained should be greater than 0. How can I implement this in the cost function? I have used a nonlinear constraint:
function [c,ceq] = insta_freq(phase)
f=diff(phase);
c=-1*double(min(f));
ceq = [];
The optimization is performed by:
nonlcon=#insta_freq;
[variable_opt,fval,exitflag,output] = fmincon(fun,ph0,[],[],[],[],[],[],nonlcon,options);
The optimization should be such that the constraint nonlcon<=0 but while optimizing with fmincon, these constraints are not satisfied. Thus, is there any other way to make sure that difference of the optimal variable vector is always greater than 0?
You could try and reduce the constraint tolerance. Also in the question it seems you are referring to the derivatives of the objective function, whereas in the question itself it seems you want every single term to be greater than the preceding one as in x1 <= x2 <= x3 <= ... <= xn. I am suggesting a possible solution for the latter problem (the first one would not even define a local optimum, so I am assuming the reported condition is what you want).
You could rewrite the condition in a matrix for as in
A = [ 1 -1 0 ... 0 ; 0 1 -1 0 ... 0 ; .... 1 -1 ] so your constraints are inequality linear constraints, simply written Aineq x <= b where b = [0;...; 0];
you just then call
[variable_opt,fval,exitflag,output] = fmincon(fun,ph0,A,b,[],[],[],[],nonlcon,options);
where A and b are the ones defined above.

MATLAB genetic algorithm constraint (all variables can't be zero at the same time in a binary environment)

I'm using MATLAB ga function for my optimization problem. In my problem, I have some decision variables which are integer (0 and 1: I specified lower bound, upper bound, and IntCon for it) plus two continues varibales. Otherwise, all integer variables can't be zero at same time, so at least, we need a single "one" variable among integers. How can I implement mentioned constraint in MATLAB?
This is a Mixed-Integer optimization problem and it can be solved using ga in MATLAB. As mentioned in the documentations: ga can solve problems when certain variables are integer-valued. Not all the variables but certain variables. So you should have at least one real variable among the other integers.
Whit IntCon options, you can specify which variables are integer, for instance IntCon=[1 3] means that your first and third variables are integer. To avoid both integer variables to be 0 at the same time, I think you can add some inequality constraints.
For instance look at the following example:
Let's say we want to find the optimum value for the Ackley function with 5 variables (e.g. in 5 dimensions), [x(1)...x(5)] and let's assume that the first and third variables, x(1) and x(3), are integers. We can write the following script:
nVar = 5;
lb = -5*ones(1,nVar); % define the upper bound
ub = 5*ones(1,nVar); % define the lower bound
rng(1,'twister') % for reproducibility
opts = optimoptions('ga','MaxStallGenerations',50,'FunctionTolerance',1e-3,'MaxGenerations',300);
[x,~,~] = ga(#ackleyfcn,nVar,[],[],[],[],lb,ub,[],[1 3],opts);
disp('solution:');disp(x)
On my machine, I get this solution:
solution:
0 -0.000000278963321 0 0.979067345808285 -0.000000280775000
It can be seen that x(1) and x(3) are integers and both 0. Now let's say as you mentioned, they both cannot be 0 at the same time and if one is 0 the other should be 1. Here the boundaries of the Ackley's problem allows the variables to be in the range defined by lower and upper bounds. However, in your case the lower and upper bounds should be defined as [0] and [1] for both integer variables.
Now I want to avoid both variables to be 0, so I can write the following linear inequality constraint:
% x(1) + x(3) >= 1
% x(1) >= 0
% x(3) > 0
These inequalities should be written in form Ax <= b:
A = [-1 0 -1 0 0
-1 0 0 0 0
0 0 -1 0 0];
b = [-1
0
0];
Now if we run the optimization problem again we see the effect of the constraints on the output:
[x,~,~] = ga(#ackleyfcn,nVar,A,b,[],[],lb,ub,[],[1 3],opts);
disp('solution');disp(x)
solution
1.000000000000000 -0.000005031565831 0 -0.000011740569861 0.000008060759466

best way to obtain one answer that satisfy a linear equation in matlab

I have a linear equation:
vt = v1*x1 + v2*x2 + v3*x3
vt, v1, v2, v3 are scalars with values between 0 and 1. What is the best way to generate one set (any set will be fine) of x1, x2 and x3 that satisfy the equation above. and also satisfy
x1>0
x2>0
x3>0
I have couple thousand sets of vt,v1,v2 and v3, therefore I need to be able to generate x1, x2 and x3 programmatically.
There are two ways you could approach this:
From the method that you have devised in your post. Randomly generate x1 and x2 and ensure that vt < v1*x1 + v2*x2, then go ahead and solve for x3.
Formulate this into linear program. A linear program is basically solving a system of equations that are subject to inequality or equality constraints. In other words:
As such, we can translate your problem to be of a linear programming problem. The "maximize" statement is what is known as the objective function - the overall goal of what you are trying to accomplish. In linear programming problems, we are trying to minimize or maximize this objective. To do this, we must satisfy the inequalities seen in the subject to condition. Usually, this program is represented in canonical form, and so the constraints on each variable should be positive.
The maximize condition can be arbitrary as you don't care about the objective. You just care about any solution. This whole paradigm can be achieved by linprog in MATLAB. What you should be careful with is how linprog is specified. In fact, the objective is minimized instead of maximized. The conditions, however, are the same with the exception of ensuring that all of the variables are positive. We will have to code that in ourselves.
In terms of the arbitrary objective, we can simply do x1 + x2 + x3. As such, c = [1 1 1]. Our equality constraint is: v1*x1 + v2*x2 + v3*x3 = vt. We also must make sure that x is positive. In order to code this in, what we can do is choose a small constant so that all values of x are greater than this value. Right now, linprog does not support strict inequalities (i.e. x > 0) and so we have to circumvent this by doing this trick. Also, to ensure that the values are positive, linprog assumes that the Ax <= b. Therefore, a common trick that is used is to negate the inequality of x >= 0, and so this is equivalent to -x <= 0. To ensure the values are non-zero, we would actually do: -x <= -eps, where eps is a small constant. However, when I was doing experiments, by doing it this way, two of the variables end up to be the same solution. As such, what I would recommend we do is to generate good solutions that are random each time, let's draw b to be from a uniform random distribution as you said. This will then give us a starting point every time we want to solve this problem.
Therefore, our inequalities are:
-x1 <= -rand1
-x2 <= -rand2
-x3 <= -rand3
rand1, rand2, rand3 are three randomly generated numbers that are between 0 and 1. In matrix form, this is:
[-1 0 0][x1] [-rand1]
[0 -1 0][x2] <= [-rand2]
[0 0 -1][x3] [-rand3]
Finally, our equality constraint from before is:
[v1 v2 v3][x1] [vt]
[x2] =
[x3]
Now, to use linprog, you would do this:
X = linprog(c, A, b, Aeq, beq);
c is a coefficient array that is defined for the objective. In this case, it would be defined as [1 1 1], A and b is the matrix and column vector defined for the inequality constraints and Aeq and beq is the matrix and column vector defined for the equality constraints. X would thus give us the solution after linprog converges (i.e. x1, x2, x3). As such, you would do this:
A = -eye(3,3);
b = -rand(3,1);
Aeq = [v1 v2 v3];
beq = vt;
c = [1 1 1];
X = linprog(c, A, b, Aeq, beq);
As an example, suppose v1 = 0.33, v2 = 0.5, v3 = 0.2, and vt = 2.5. Therefore:
rng(123); %// Set seed for reproducibility
v1 = 0.33; v2 = 0.5; v3 = 0.2;
vt = 2.5;
A = -eye(3,3);
b = -rand(3,1);
Aeq = [v1 v2 v3];
beq = vt;
c = [1 1 1];
X = linprog(c, A, b, Aeq, beq);
I get:
X =
0.6964
4.4495
0.2268
To verify that this equals vt, we would do:
s = Aeq*X
s = 2.5000
The above simply does v1*x1 + v2*x2 + v3*x3. This is computed in a dot product form to make things easy as X is a column vector and v1, v2, v3 are already set in Aeq and is a row vector.
As such, either way is good, but at least with linprog, you don't have to keep looping until you get that condition to be satisfied!
Small Caveat
One small caveat that I forgot to mention in the above approach is that you need to make sure that vt >= v1*rand1 + v2*rand2 + v3*rand3 to ensure convergence. Since you said that v1,v2,v3 are bounded between 0 and 1, the worst case is when v1,v2,v3 are all equal to 1. As such, we really need to make sure that vt > rand1 + rand2 + rand3. If this is not the case, then simply take each value of rand1, rand2, rand3, and divide by (rand1 + rand2 + rand3) / vt. As such, this will ensure that the total summation will equal vt assuming that all of the weights are 1, and this will allow the linear program to converge properly.
If you don't, then the solution will not converge due to the inequality conditions placed in for b, and you won't get the right answer. Just some food for thought! As such, do this for b before you run linprog
if sum(-b) > vt
b = b ./ (sum(-b) / vt);
end
Good luck!

How can I generate correlated data in MATLAB, with a prespecified SD and mean?

I wish to create one vector of data points with a mean of 50 and a standard deviation of 1. Then, I wish to create a second vector of data points again with a mean of 50 and a standard deviation of 1, and with a correlation of 0.3 with the first vector. The number of data points doesn't really matter but ideally I would have 100.
The method mentioned at Generating two correlated random vectors does not answer my question because (due to random sampling) the SDs and means deviate too much from the desired number.
I worked out a way, though it is ugly. I would still welcome an answer that detailed a more elegant method to get what I want.
z = 0;
while z < 1
mu = 50
sigma = 1
M = mu + sigma*randn(100,2);
R = [1 0.3; 0.3 1];
L = chol(R)
M = M*L;
x = M(:,1);
y = M(:,2);
if (corr(x,y) < 0.301 & corr(x,y) > 0.299) & (std(x) < 1.01 & std(x) > 0.99) & (std(y) < 1.01 & std(y) > 0.99);
z = 1;
end
end
I then calculated how the mean of vector y and calculated how much higher than 50 it was. I then subtracted that number from every element in vector y so that the mean was reduced to 50.
You can create both vectors together... I dont understand the reason you define them separatelly. This is the concept of multivariate distribution (just to be sure that we have the same jargon)...
Anyway, I guess you are almost already there to what I call the simplest way to do that:
Method 1:
Use matlab function mvnrnd [Remember that mvnrnd uses the covariance matrix that can be calculated from the correlation and the variance)
Method 2:
I am not very sure, but I think it is very close to what you are doing (actually my doubt is related to the if (corr(x,y) < 0.301 & corr(x,y) > 0.299) & (std(x) < 1.01 & std(x) > 0.99) & (std(y) < 1.01 & std(y) > 0.99)) I dont understand the reason you have to do that. See the topic "Drawing values from the distribution" in wikipedia Multivariate normal distribution.