This may be a really simple question for MATLAB fmincon users:
I have a function Y = A*X, where A is a 1 x N vector of constants, Y is a scalar constant, and X is an N x 1 vector. I need to find the optimal values of X such that
Y - A*X = 0. Initial values for X come from an N x 1 vector X0. Also
X0(1) < X0(2) < ... < X0(N)
The constraints are:
0 <= X(1);
X(1) < X(2);
X(2) < X(3);
...
...
X(N-1) < X(N);
X(N) <= 1;
and
X0(1) <= X(1);
X0(2) <= X(2);
X0(3) <= X(3);
...
...
X0(N-1) <= X(N-1);
My attempt at this problem was:
[X, fval] = fmincon(@(X) Y - A*X, X0,[],[],[],[],X0,[X(2:end);1],[],options);
I don't think the results I get are correct with this method.
Another attempt was this:
[X, fval] = fmincon(@(X) Y - A*X, X0,AA,zeros(N-1,1),[],[],[],[],[],options);
where
AA = [ 1 -1  0  0  ...  0  0;
       0  1 -1  0  ...  0  0;
       ...
       0  0  0  0  ...  1 -1 ];   % (N-1) rows, N columns
with failure as well.
Any suggestions, hints would be very welcome! I hope I have given enough information regarding the problem.
Following Shai's suggestion, I have tried this:
[X, fval] = fmincon(@(X) abs(Y - A*X), X0,AA,eps(0)*ones(N-1,1),[],[],X0,[],[],options);
But no success: after exactly N iterations the solution converges to X0.
I used abs to make Y - A*X minimize towards 0, and X0 as the lower bound to satisfy the conditions X0(1) <= X(1), etc.
Thanks Shai, using the artificial / dummy objective function was a great idea. However, when I use linprog in the following way:
[X, fval] = linprog( -(1:N), AA, -eps(0)*ones(N-1,1), ...
A, Y, X0, [], X0, options );
I get the problem that Y is a scalar while A is still a 1 x N vector, and linprog expects Y to be a vector as well. That changes the problem of course. I set the lower bound to X0, left the upper bound empty (since the inequality constraints take care of it), and set the initial values to X0. So it is still not working. I will update if any resolution occurs.
I can see no reason why we cannot find an X for which A*X = Y exactly (there should be enough degrees of freedom for sufficiently large N). So instead of making |A*X - Y| the objective function to be minimized, I suggest adding A*X = Y as an equality constraint.
Moreover, to encourage X(i) < X(j) for i < j, I suggest an "artificial" objective function -(1:N)*X that puts larger weight on X(j) for large j.
[X, fval] = fmincon( @(X) -(1:N)*X(:), X0, AA, -eps(0)*ones(N-1,1), ...
A, Y, zeros(N,1), ones(N,1), [], options);
And while we are at it, why aren't you using linear programming? It should work better than the general-purpose fmincon, since it is tailored to the linear case:
[X, fval] = linprog( -(1:N), AA, -eps(0)*ones(N-1,1), ...
A, Y, zeros(N,1), ones(N,1), X0, options );
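For completeness, here is a minimal sketch of the whole setup, assuming A (1 x N) and a feasible scalar Y are already in the workspace; the (N-1) x N difference matrix AA is built explicitly rather than typed out:
N  = numel(A);
AA = [eye(N-1), zeros(N-1,1)] - [zeros(N-1,1), eye(N-1)]; % rows [1 -1 0 ...], [0 1 -1 0 ...], ...
f  = -(1:N);                       % "artificial" objective: reward larger X(j) for larger j
lb = zeros(N,1);                   % 0 <= X(i)
ub = ones(N,1);                    % X(i) <= 1
[X, fval] = linprog(f, AA, -eps(0)*ones(N-1,1), A, Y, lb, ub);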
Sorry to bother you, but I'm fighting with the Octave manual without any result.
I would like to maximize a slightly complex function with some constraints.
The function is mathematically similar to this one (I write this one to simplify the explanation):
f(x, y, z, t) = arcsin(x / (2t)) / (y + x + max(1, z/t))
with z, t between 0 and 1, x between 1 and 2, and y greater than 1/x^2.
Could you give me the code to compute the numerical values of x, y, z and t that maximize this function? From that code, I will work out how the optimisation function should be used.
It will help me a lot.
Thank you very much
You can use fmincon and minimize the additive inverse of your objective function. Your constraint ("y greater than 1/x^2") is nonlinear, so you should use the nonlcon argument of fmincon:
% function definition (minus sign to maximize instead of minimize)
f = @(x) -asin(x(1)/(2*x(4))) / (x(2) + x(1) + max(1, x(3)/x(4)));
lb = [1 -inf 0 0]; % lower bound for [x y z t]
ub = [2 inf 1 1];  % upper bound for [x y z t]
x0 = [1.5 0 0.5 0.9]; % initial vector
% minimization
x = fmincon(f, x0, [], [], [], [], lb, ub, @mycon);
where mycon defines your constraint:
function [c, ceq] = mycon(x)
% y > 1/x^2  is rewritten as  1/x^2 - y < 0
c = 1/x(1)^2 - x(2);
ceq = []; % no nonlinear equality constraints
end
You can also pass an options argument to specify optimization options.
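For example, something like this (the specific option values are only illustrative):
% show iteration output and tighten the function tolerance
options = optimset('Display', 'iter', 'TolFun', 1e-10);
x = fmincon(f, x0, [], [], [], [], lb, ub, @mycon, options);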
I am trying to fit a piecewise linear equation to my (xdata, ydata) data. I have two challenges: the first is how to convert the equation into the form of a function handle, and the second is how to put constraints on the slopes, for instance a2 > a1, a2 > 0 and a1 > 0.
xdata = 5:0.2:40;
ydata = max(18,xdata) + 0.5*randn(size(xdata));
a1 = (y1-y0)/(x1-x0); a2 = (y2-y1)/(x2-x1);
if x < x1;
f(x) = y0 + a1*(x-x0);
else
f(x) = y0 + a1*(x1-x0) + a2*(x-x1);
end
FU = matlabFunction(f)
x0 = 5; y0 = 16;
x = lsqcurvefit(FU,[x0,y0],xdata,ydata)
The key to creating the piecewise function is to replace the if condition by a vectorized >. Calling y = x > 2 on some array x returns an array y of the same size as x, with a logical true wherever the corresponding element of x is larger than 2, and false otherwise. For example
>> x = [1, 2, 4; 3, 1, 2];
>> y = x > 2
y =
2×3 logical array
0 0 1
1 0 0
You can utilize this to create a piecewise linear function, as follows:
>> fun = @(theta, xdata) theta(1) + ...
(xdata<=theta(2)) .* theta(3) .* xdata + ...
(xdata>theta(2)) .* (theta(3) * theta(2) + ...
theta(4) .* (xdata-theta(2)))
The parameter vector theta will be 4-dimensional: the first element is a constant offset from zero, the second element is the corner point, and the third and fourth elements are the two slopes.
By multiplying theta(3).*xdata with the result of xdata<=theta(2), you get theta(3).*xdata for each point in xdata that is less than or equal to theta(2), and 0 for all others.
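As a quick sanity check with some hypothetical parameter values (offset 18, corner point 18, slopes 0 and 1), the two pieces join continuously at the corner:
>> theta = [18; 18; 0; 1];
>> fun(theta, [10 18 25])
ans =
    18    18    25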
Then, calling lsqcurvefit is as simple as
>> theta = lsqcurvefit(fun, [0; 15; 0; 1], xdata, ydata)
theta =
18.3793
17.9639
-0.0230
0.9943
The lsqcurvefit function also allows you to specify a lower bound lb and an upper bound ub for the variables you want to estimate. For variables where you don't want to specify a bound, you can use e.g. inf as bound. To make sure that your a1 and a2, i.e. theta(3) and theta(4) are positive, we can specify the lower bound to be [-inf, -inf, 0, 0].
However, the lsqcurvefit function doesn't allow you to add the constraint a2 > a1 (or any linear inequality constraints). In the example data, this constraint probably isn't even necessary as this is obvious from the data. Otherwise, a possible solution would be to replace a2 by a1 + da, and use a lower bound of 0 for da. This makes sure that a2 >= a1.
>> fun = @(theta, xdata) theta(1) + ...
(xdata<=theta(2)) .* theta(3) .* xdata + ...
(xdata>theta(2)) .* (theta(3) * theta(2) + ...
(theta(3)+theta(4)) .* (xdata-theta(2)))
>> theta = lsqcurvefit(fun, [0; 15; 0; 1], xdata, ydata, [-Inf, -Inf, 0, 0], [])
theta =
18.1162
18.1159
0.0000
0.9944
I just started taking Andrew Ng's course on Machine Learning on Coursera.
The topic of the third week is logistic regression, so I am trying to implement the following cost function:
J(theta) = -(1/m) * sum_{i=1}^{m} [ y(i)*log(h_theta(x(i))) + (1 - y(i))*log(1 - h_theta(x(i))) ]
The hypothesis is defined as h_theta(x) = g(theta' * x),
where g is the sigmoid function:
g(z) = 1 / (1 + exp(-z))
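The sigmoid helper called in the code below is assumed to be the usual element-wise implementation, something along these lines:
function g = sigmoid(z)
% element-wise logistic function g(z) = 1 ./ (1 + exp(-z))
g = 1 ./ (1 + exp(-z));
end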
This is how my function looks at the moment:
function [J, grad] = costFunction(theta, X, y)
m = length(y); % number of training examples
S = 0;
J = 0;
for i=1:m
Yi = y(i);
Xi = X(i,:);
H = sigmoid(transpose(theta).*Xi);
S = S + ((-Yi)*log(H)-((1-Yi)*log(1-H)));
end
J = S/m;
end
Given the following values
X = [magic(3) ; magic(3)];
y = [1 0 1 0 1 0]';
[j g] = costFunction([0 1 0]', X, y)
j returns 0.6931 2.6067 0.6931 even though the result should be j = 2.6067. I am assuming that there is a problem with Xi, but I just can't see the error.
I would be very thankful if someone could point me to the right direction.
You are supposed to apply the sigmoid function to the dot product of your parameter vector (theta) and input vector (Xi, which in this case is a row vector). So, you should change
H = sigmoid(transpose(theta).*Xi);
to
H = sigmoid(theta' * Xi'); % or sigmoid(Xi * theta)
Of course, you need to make sure that the bias input 1 is added to your inputs (a column of 1s prepended to X).
Next, think about how you can vectorize this entire operation so that it can be written without any loops. That way it would be considerably faster.
function [J, grad] = costFunction(theta, X, y)
m = length(y); % number of training examples
% vectorized cost and gradient
J = (1/m) * ((-y' * log(sigmoid(X*theta))) - ((1-y)' * log(1 - sigmoid(X*theta))));
grad = (1/m) * (X' * (sigmoid(X*theta) - y));
end
The above code snippet works perfectly fine for the logistic regression cost and gradient functions, provided the sigmoid function itself is working correctly.
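As a quick check, running it on the example data from the question (assuming sigmoid.m is on the path) reproduces the expected value:
X = [magic(3); magic(3)];
y = [1 0 1 0 1 0]';
[j, g] = costFunction([0 1 0]', X, y);  % j comes out as the scalar 2.6067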
I am trying to write a function that implements Newton's method in two dimensions. I have done this, but I now have to adjust my script so that the input parameters of my function are f(x) as a column vector, the Jacobian matrix of f(x), the initial guess x0 and the tolerance, where the function f(x) and its Jacobian matrix live in separate .m files.
As an example of a script I wrote that implements Newton's method, I have:
n = 0;        %initialize iteration counter
eps = 1;      %initialize error
x = [1; 1];   %set starting value
%Computation loop
while eps > 1e-10 && n < 100
    g = [x(1)^2 + x(2)^3 - 1; x(1)^4 - x(2)^4 + x(1)*x(2)];       %g(x)
    eps = abs(g(1)) + abs(g(2));                                  %error
    Jg = [2*x(1), 3*x(2)^2; 4*x(1)^3 + x(2), -4*x(2)^3 + x(1)];   %Jacobian
    y = x - Jg\g;   %iterate
    x = y;          %update x
    n = n + 1;      %counter+1
end
n, x, eps   %display end values
With this script, the function and the Jacobian matrix are hard-coded into the script itself, and I am struggling to work out how to turn it into a function with the required input parameters.
Thanks!
If you don't mind, I'd like to restructure your code so that it is more dynamic and easier to read.
Let's start with some preliminaries. If you want to make your script truly dynamic, then I would recommend that you use the Symbolic Math Toolbox. This way, you can use MATLAB to tackle derivatives of functions for you. You first need to use the syms command, followed by any variable you want. This tells MATLAB that you are now going to treat this variable as "symbolic" (i.e. not a constant). Let's start with some basics:
syms x;
y = 2*x^2 + 6*x + 3;
dy = diff(y); % Derivative with respect to x. Should give 4*x + 6;
out = subs(y, 3); % The subs command will substitute all x's in y with the value 3
% This should give 2*(3^2) + 6*3 + 3 = 39
Because this is 2D, we're going to need 2D functions... so let's define x and y as variables. The way you call the subs command will be slightly different:
syms x y; % Two variables now
z = 2*x*y^2 + 6*y + x;
dzx = diff(z, 'x'); % Differentiate with respect to x - should give 2*y^2 + 1
dzy = diff(z, 'y'); % Differentiate with respect to y - should give 4*x*y + 6
out = subs(z, {x, y}, [2, 3]); % For z, with variables x,y, substitute x = 2, y = 3
% Should give 56
One more thing... we can place equations into vectors or matrices and use subs to simultaneously substitute all values of x and y into each equation.
syms x y;
z1 = 3*x + 6*y + 3;
z2 = 3*y + 4*y + 4;
f = [z1; z2];
out = subs(f, {x,y}, [2, 3]); % Produces a 2 x 1 vector with [27; 25]
We can do the same thing for matrices, but for brevity I won't show you how to do that. I will defer to the code and you can see it then.
Now that we have that established, let's tackle your code one piece at a time to truly make this dynamic. Your function requires the initial guess x0, the function f(x) as a column vector, the Jacobian matrix as a 2 x 2 matrix and the tolerance tol.
Before you run your script, you will need to generate your parameters:
syms x y; % Make x,y symbolic
f1 = x^2 + y^3 - 1; % Make your two equations (from your example)
f2 = x^4 - y^4 + x*y;
f = [f1; f2]; % f(x) vector
% Jacobian matrix
J = [diff(f1, 'x') diff(f1, 'y'); diff(f2, 'x') diff(f2, 'y')];
% Initial vector
x0 = [1; 1];
% Tolerance:
tol = 1e-10;
Now, make your script into a function:
% To run in MATLAB, do:
% [n, xout, ep] = Jacobian2D(f, J, x0, tol);
% disp('n = '); disp(n); disp('x = '); disp(xout); disp('error = '); disp(ep);
function [n, xout, ep] = Jacobian2D(f, J, x0, tol)
% Just to be sure...
syms x y;
% Initialize error
ep = 1; % Note: eps is the name of a built-in MATLAB function, so we use ep instead
% Initialize counter
n = 0;
% For the beginning of the loop
% Must transpose into a row vector as this is required by subs
xout = x0';
% Computation loop
while ep > tol && n < 100
    g = subs(f, {x,y}, xout);            %g(x)
    ep = double(abs(g(1)) + abs(g(2)));  %error (cast from sym so the comparison works)
    Jg = subs(J, {x,y}, xout);           %Jacobian
    yout = xout - (Jg\g).';              %iterate (transpose so xout stays a row vector)
    xout = yout;                         %update x
    n = n + 1;                           %counter+1
end
% Transpose and convert back to number representation
xout = double(xout');
I should probably tell you that when you're doing computation using the Symbolic Math Toolbox, the data type of the numbers as you're calculating them is sym. You probably want to convert these back into real numbers, so you can use double to cast them back. However, if you leave them in the sym format, it displays your numbers as neat fractions, if that's what you're looking for. Cast to double if you want the decimal point representation.
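For instance:
syms x;
val = subs(x/3, x, 1)   % val is the sym 1/3, displayed as a fraction
double(val)             % ans = 0.3333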
Now when you run this function, it should give you what you're looking for. I have not tested this code, but I'm pretty sure this will work.
Happy to answer any more questions you may have. Hope this helps.
Cheers!
I want to use Gurobi solver in Matlab, but I don't know how to calculate the required matrices (qrow and qcol).
For your reference I am copying the example provided in documentation.
minimize   0.5 x^2 - xy + y^2 - 2x - 6y
subject to
x + y <= 2
-x + 2y <= 2
2x + y <= 3
x >= 0, y >= 0
c = [-2 -6]; % objective linear term
objtype = 1; % minimization
A = sparse([1 1; -1 2; 2 1]); % constraint coefficients
b = [2; 2; 3]; % constraint right-hand side
lb = []; % [ ] means 0 lower bound
ub = []; % [ ] means inf upper bound
contypes = '<<<'; % constraint senses: all three constraints are <=
vtypes = [ ]; % [ ] means all variables are continuous
QP.qrow = int32([0 0 1]); % indices of x, x, y as in (0.5 x^2 - xy + y^2); use int64 if sizeof(int) is 8 on your system
QP.qcol = int32([0 1 1]); % indices of x, y, y as in (0.5 x^2 - xy + y^2); use int64 if sizeof(int) is 8 on your system
QP.qval = [0.5 -1 1]; % coefficients of (0.5 x^2 - xy + y^2)
Does it mean that if I have 4 decision variables, then I should use 0, 1, 2, 3 as the indices for my decision variables x_1, x_2, x_3, x_4?
Thanks
Note: I tried to use mathurl.com, but I couldn't work out how to write it in the proper format so that it would appear as LaTeX. Sorry for the notation.
This seems to be your reference. However, your question seems to relate to a different example; you may need to show that one.
Anyway, according to the Gurobi documentation:
The quadratic terms in the objective function should be specified by opts.QP.qrow, opts.QP.qcol, and opts.QP.qval, which correspond to the input arguments qrow, qcol, and qval of the function GRBaddqpterms. They are all 1D arrays. The first two arguments, qrow and qcol, specify the row and column indices (starting from 0) of 2nd-order terms such as x^2 and xy. The third argument, qval, gives their coefficients.
So the answer is yes: use the indices 0, 1, 2, 3 for your decision variables x_1, x_2, x_3, x_4.
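For instance, with four decision variables (0-based indices 0 to 3), a hypothetical quadratic part 0.5*x0^2 - x0*x1 + x2*x3 would be encoded as:
QP.qrow = int32([0 0 2]); % first factor of each term:  x0, x0, x2
QP.qcol = int32([0 1 3]); % second factor of each term: x0, x1, x3
QP.qval = [0.5 -1 1];     % coefficients of the three terms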