Suppose I have a MATLAB function calc(x,a,b) which outputs a scalar. a and b are constants, and x is multivariate. How do I minimize calc(x,a,b) with respect to x in MATLAB?
edit: The function builds a vector $v(x)$ and a matrix $A(x)$ and then computes $v(x)^\top A(x)^{-1} v(x)$.
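For concreteness, a sketch of such a function; v(x) and A(x) here are made-up placeholders, and the backslash operator is used instead of forming the inverse explicitly:
function y = calc(x, a, b)
% made-up v(x) and A(x), just to make the sketch runnable
v = a*x + b;        % some vector built from x
A = diag(1 + x.^2); % some (invertible) matrix built from x
y = v' * (A \ v);   % same value as v'*inv(A)*v, but faster and more stable
end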
This is a fairly general question with a million possible answers depending on what calc is. (For instance: can you provide gradients for calc? Does x need to take values in a specific range?)
But, as a start, go for fminunc. It finds an unconstrained minimum and does not require you to supply gradient information.
Sample Code:
Suppose you want to minimize dot(x,x).
calc = @(x,a,b) dot(x,x)
calc_to_pass_to_fminunc = @(x) calc(x,1,2)
X = fminunc(calc_to_pass_to_fminunc,ones(3,1))
Gives:
Warning: Gradient must be provided for trust-region algorithm;
using line-search algorithm instead.
> In fminunc at 383
Local minimum found.
Optimization completed because the size of the gradient is less than
the default value of the function tolerance.
<stopping criteria details>
X =
0
0
0
The easy answer is: if a and b are constants and x is a one-dimensional variable, this is a 1-D optimization problem.
The previous answer suggests using fminunc, which is part of the MATLAB Optimization Toolbox. If you don't have it, you can use fminbnd instead, which works just as well for 1-D optimization in a given interval.
As an example, let's say your calc function is:
function [y] = calc(x,a,b)
y = x.^3-2*x-5+a-b;
end
This is what you should do to find the minimum in the interval x1 < x < x2:
% constants
a = 1;
b = 2;
% boundaries of search interval
x1 = 0;
x2 = 2;
x = fminbnd(@(x)calc(x,a,b), x1, x2);
% value of function at the minimum
y = calc(x,a,b);
If the x variable is not a scalar, you can use the multidimensional analogue of fminbnd: fminsearch, which performs an unconstrained search for the minimum of a multivariate function.
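A minimal sketch with a made-up two-variable objective (calc2 is a hypothetical function, not from the question):
% made-up objective of a 2-element vector x, minimized at [a; b]
calc2 = @(x,a,b) (x(1)-a)^2 + (x(2)-b)^2;
a = 1;
b = 2;
x0 = [0; 0]; % initial guess
x = fminsearch(@(x)calc2(x,a,b), x0) % should approach [1; 2]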
Addendum
fminbnd is a nice tool, but sometimes it's hard to make it behave as you expect. Of course you can specify the desired accuracy and a maximum number of iterations in the options, but in my experience fminbnd can have problems with highly non-linear functions.
In these situations it's desirable to have finer control over the optimization procedure, and especially over how the search interval is defined. Given the search interval, arrayfun provides an elegant way to iterate over an array to find a minimum of the function. Sample code:
% constants
a = 1;
b = 2;
% search interval
xi = linspace(0,2,1000);
yi = arrayfun(@(x)calc(x,a,b), xi);
% value of function at the minimum
[y, idx_m] = min(yi);
% location of minimum
x = xi(idx_m);
The drawback of this approach is that, in order to achieve high accuracy, you might need a very long array xi. The good thing is that there are several ways to mitigate this issue: for instance, one could use a vector of log-spaced sampling points, or perform a multi-step minimization, narrowing the interval and increasing the sampling frequency at each step until the desired accuracy is achieved, as in the sketch below.
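A sketch of the multi-step idea, reusing the calc function from above (the grid sizes here are arbitrary):
% constants
a = 1;
b = 2;
% coarse pass over the search interval
xi = linspace(0,2,100);
yi = arrayfun(@(x)calc(x,a,b), xi);
[~, idx] = min(yi);
% fine pass: denser grid around the coarse minimum
xf = linspace(xi(max(idx-1,1)), xi(min(idx+1,numel(xi))), 1000);
yf = arrayfun(@(x)calc(x,a,b), xf);
[y, idx_f] = min(yf);
x = xf(idx_f);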
Related
I am trying to solve a system of 12 equations in Matlab. Because I have constraints on the minimum and maximum values of the variables I use lsqnonlin rather than fsolve. However, I would like the optimizer to stop once the output (sum of squared deviations from the point where each equation holds) is sufficiently close to zero. Is there a way to specify such a stopping criterion?
The standard stopping criteria compare the change in the output value against the previous iteration, but this is less relevant for me.
Use the fmincon function to solve the equations with bound constraints.
Since you have not provided your equations, follow the example from the MATLAB documentation:
fun = #(x)1+x(1)/(1+x(2)) - 3*x(1)*x(2) + x(2)*(1+x(1)); % objective function
lb = [0,0]; % lower bounds
ub = [1,2]; % upper bounds
x0 = (lb + ub)/2; % initial estimate
x = fmincon(fun,x0,[],[],[],[],lb,ub)
This specifies the ranges 0 <= x(1) <= 1 and 0 <= x(2) <= 2 for the variables.
The fmincon function also lets you change the default options. To specify a tolerance on the objective, set FunctionTolerance:
options = optimoptions('fmincon','Display','iter','FunctionTolerance',1e-10);
This sets the fmincon options to show iterative display and to use a FunctionTolerance of 1e-10. Call fmincon with these non-default options:
x = fmincon(fun,x0,[],[],[],[],lb,ub,[],options)
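Note that FunctionTolerance measures the change in the objective between iterations, not its absolute value. If you literally want to stop once the output is sufficiently close to zero, a sketch using the documented OutputFcn option (stopTol is an assumed threshold):
% stop early once the objective drops below a threshold
stopTol = 1e-8; % assumed value for "sufficiently close to zero"
outFcn = @(x,optimValues,state) optimValues.fval < stopTol;
options = optimoptions('fmincon','OutputFcn',outFcn);
x = fmincon(fun,x0,[],[],[],[],lb,ub,[],options)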
Given a scalar function handle f(x) defined on a closed interval [a, b], I wish to find the largest zero, i.e. the largest value x in [a,b] s.t. f(x)=0.
It seems fzero() does not support functions defined on closed intervals, so starting it from fzero(f, b) leads to an error.
So I use fminbnd() with the function g=@(x) (f(x)).^2, but it does not support an "initial condition" of x=b or otherwise biasing it toward the largest zero...
According to the docs, fzero() does support closed intervals, except that the function values at the interval endpoints must have different signs. So it may or may not work, depending on your function.
Documentation:
Initial value, specified as a real scalar or a 2-element real vector...
2-element vector — fzero checks that fun(x0(1)) and fun(x0(2)) have opposite signs, and errors if they do not. It then iteratively shrinks the interval where fun changes sign to reach a solution. An interval x0 must be finite; it cannot contain ±Inf.
With that said, a quick and dirty way of finding any zero crossing that I like to use relies on the sign(), diff(), and find() functions:
x = -10:10; % Any interval you want
y = func(x); % Evaluate your function
d = diff(sign(y)); % Change in signs. Any non-zero values are the places (near) zero crossings
ind = find(d); % Get indices of all non-zero values
xz = x(ind); % Get x's near/at zero crossings.
You can find the largest x where f(x) = 0 by simply choosing the biggest value in xz.
Now obviously this will only give you a crude approximation, and may not be ideal depending on your task; but choose a fine enough interval to test and it should work.
Or you can use these results as starting points and find precise values with more advanced numerical algorithms, like the Newton-Raphson method.
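For instance, since each crossing found above is bracketed between x(ind) and x(ind+1) (the sign changes there), fzero with a two-element bracket is one way to refine it; a sketch reusing the variables from the snippet above, assuming no sample lands exactly on a zero:
% refine each bracketed crossing with fzero, then take the largest
xz_refined = arrayfun(@(k) fzero(func, [x(k), x(k+1)]), ind);
x_largest = max(xz_refined);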
I'm trying to code a MATLAB program and I have arrived at a point where I need to do the following. I have this equation:
$$-\frac{1}{c^2}\int_{x_{le}}^{x_{te}} c_l(x)\,(x - X_{cp})\,dx = 0$$
I must find the value of the constant Xcp (greater than zero) that makes the integral equal to zero.
In order to do so, I have coded a loop in which the value of Xcp advances in small increments on each iteration, and the integral is evaluated and checked against zero; when it reaches zero the loop finishes and that value of Xcp is stored.
However, I think this is not an efficient way to do this task. The running time increases a lot, because the loop is long and has to perform the integral and substitute the integration limits every time.
Is there a smarter way to do this in MATLAB to obtain better code efficiency?
P.S.: I have used conv() to multiply the two polynomials, since cl(x) and (x-Xcp) are both polynomials.
EDIT: Piece of code.
Xcp=0.001;
found=false;
while(Xcp<=x_te && found~=true) % Xcp is upper bounded by x_te
p = [1 -Xcp]; % polynomial (x-Xcp)
int_cl_p = polyint(conv(cl,p));
Cm_cp=(-1/c^2)*diff(polyval(int_cl_p,[x_le,x_te]));
if(Cm_cp==0)
found=true;
else
Xcp=Xcp+0.001;
end
end
This is the code I used for this section. Another problem is that I have to do it for different cases (different cl functions), which makes the code even slower.
As far as I understood, you need to solve the equation for X_CP.
I suggest using the symbolic solver for this. It is not the most efficient way for large polynomials, but for polynomials of degree 20 it takes less than 1 second. I do not claim that this solution is the fastest, but it provides a generic solution to the problem. If your polynomial does not change every iteration, you can reuse this generic solution many times and not spend time calculating the integral.
So, the generic symbolic solution in terms of xLE and xTE is obtained like this:
syms xLE xTE c x xCP
a = 1:20;
%//arbitrary polynomial of degree 20
cl = sum(x.^a.*randi([-100,100],1,20));
tic
eqn = -1/c^2 * int(cl * (x-xCP), x, xLE, xTE) == 0;
xCP = solve(eqn,xCP);
pretty(xCP)
toc
Elapsed time is 0.550371 seconds.
You can further use matlabFunction for finding the numerical solutions:
xCP_numerical = matlabFunction(xCP);
%// we then just plug xLE = 10 and xTE = 20 values into function
answer = xCP_numerical(10,20)
answer =
19.8038
A slight modification of the code allows you to use this for generic coefficients.
Hope that helps
Since the right-hand side is zero, the $-1/c^2$ factor drops out and you can rearrange as
$$X_{CP} = \frac{\int_{x_{le}}^{x_{te}} x\,c_l(x)\,dx}{\int_{x_{le}}^{x_{te}} c_l(x)\,dx}$$
and integrate however you fancy. Since $c_l$ is a polynomial of order N, if it's defined in MATLAB using the usual notation for polyval, where the coefficients are stored in a vector a such that
$$c_l(x) = a_1 x^N + a_2 x^{N-1} + \dots + a_{N+1},$$
then the integration is straightforward with polyint (appending a zero to the coefficient vector, [cl 0], gives the coefficients of $x\,c_l(x)$).
MATLAB code might look something like this:
int_cl_p = polyint(cl);
int_cl_x_p = polyint([cl 0]);
X_CP = diff(polyval(int_cl_x_p,[x_le,x_te]))/diff(polyval(int_cl_p,[x_le,x_te]));
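As a quick sanity check with a made-up polynomial, say cl(x) = x^2 + 1 on [0, 2]:
% made-up example inputs
cl = [1 0 1]; x_le = 0; x_te = 2;
int_cl_p = polyint(cl);
int_cl_x_p = polyint([cl 0]);
X_CP = diff(polyval(int_cl_x_p,[x_le,x_te]))/diff(polyval(int_cl_p,[x_le,x_te]))
% integral of x*cl(x) over [0,2] is 6 and integral of cl(x) is 14/3,
% so X_CP = 9/7, approximately 1.2857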
Examples for optimizations with functions like fmincon.m and fminsearchbnd.m usually minimize objective functions that are relatively simple. By simple I mean that the objective function consists only of some algebraic expression, e.g. the Rosenbrock formula.
In my problem, on the other hand, the objective function consists of several steps, including
computing an L2-norm misfit between an observed data point and a set of n training data points (n~5e4)
selecting those data points from the training data set that give the lowest misfit
then using the row indices of this selected subset to compute the final distance that I intend to minimize.
i.e. I perform operations that cannot be formulated as a single mathematical expression. Can I use such an objective function with tools like fminsearchbnd.m or fmincon.m at all? My results so far are not very promising...
There is an easy and obvious solution for that: use fminsearch() to find a minimum of some self-defined function. In my example it is fitting a polynomial, which of course is easy, but the trick is that this could be anything. You can access the data if you make your objective function a nested function, so that they share the same variable scope.
You can start from the following code, fill in everything you want to do part by part, and maybe ask follow-up questions if any come up.
function main
verbose = 1; % some output
% optimize something, maybe a distorted polynomial
x = sort(rand(20,1));
p_original = [1.5, 3, 2, 1];
y = polyval(p_original,x) + 0.5*(rand(size(x))-0.5);
% optimize polynomial of order order. This is an example of how to pass
% a parameter to the fit function.
order = 3;
% obvious solution is this, but we want to do something else
p_polyfit = polyfit(x,y,order)
% we want to do it a bit more complex
pfit = optimize_something(x, y, order, verbose)
% what is happening?
figure
plot(x,polyval(p_original,x),'k-')
hold on
plot(x,y,'ko')
plot(x,polyval(p_polyfit,x),'rs-')
plot(x,fit_function(x,pfit),'gx-')
legend('original','noisy','polyfit','optimization')
end
function pfit = optimize_something(x,y, order, verbose)
% for polynomial of order order we need order+1 coefficients
p0 = ones(1,order+1); % initial guess: all coefficients are 1
if verbose
fprintf('optimize_something calling fminsearch(@objFun)\n');
end
% hand over only p0 to our objective function
pfit = fminsearch(@objFun, p0);
% ------------------------- NESTED objFUN --------------------------------%
function e = objFun(p)
% This function accepts only p as parameter and returns a value e, which
% will be minimized by some metric (maybe least squares).
% Since this function is nested, it can also use the variables x and y defined in the parent (and p0 and verbose).
% The magic is, we calculate a value yfitted out of x and p by a
% fit_function. This function can really be anything!
yfitted = fit_function(x, p);
e = sum((yfitted-y).^2);
% e = sum(abs(yfitted-y)); % another possibility
end
% ------------------------- NESTED objFUN --------------------------------%
if verbose
disp('pfit found')
end
end
function yfitted = fit_function(x, p)
% In our example we want to fit a polynomial, so we do so. We evaluate the
% polynomial p at x.
yfitted = polyval(p,x);
% But it could be anything, really.. each value in p could be something
% else, maybe the sum of an exponential function and a straight line
% yfitted = p(1)*exp(p(2)*x) + p(3)*x + p(4);
end
You can try to use CVX. It is an add-on for MATLAB that lets you describe your optimisation problem with normal MATLAB code.
Alternatively, write down your objective function, including any constraints. Your description is not clear to me, and it would help you too if you wrote this down in actual formulae.
I read your steps as follows:
"Computing an L2-norm misfit between an observed data point and a set of n training data points." It seems that there is a total of one (1) observed data point. Let's call the observed point x. Let's call the training data points y_i for i=1..n.
The L2-norm misfit is $\|x - y_i\|_2$.
"Selecting those data points [multiple?] that give the lowest misfit". You haven't said how many data points you want, or how you'd combine multiple points to give a single L2-norm. Let's assume you want exactly one such point (the closest to the observed data point x). Thus you get $\arg\min_i \|x - y_i\|_2$. If you want multiple, you could greedily take the k closest points, as in the sketch after this list.
"Then using the row indices of this selected subset to compute the final distance that I intend to minimize." And what is the final distance that you intend to minimize?
I have a symbolic function whose zeros I am particularly interested in knowing. I have searched through Google, trying to find something related to my query, but was unsuccessful.
Could someone please help me?
EDIT:
syms n x t
T(x,t) = 72/((2*n+1)^2*pi^3)*(1 - (2*n+1)^2*pi^2*t/45 + (2*n+1)^4*pi^4*t^2/(2*45^2) - (2*n+1)^6*pi^6*t^3/(6*45^3))*(2*n+1)*pi*x/3;
for i=1:1:1000
T_new = 72/((2*i+1)^2*pi^3)*(1 - (2*i+1)^2*pi^2*t/45 + (2*i+1)^4*pi^4*t^2/(2*45^2) - (2*i+1)^6*pi^6*t^3/(6*45^3))*(2*i+1)*pi*x/3;
T = T + T_new;
end
T = T - 72/((2*n+1)^2*pi^3)*(1 - (2*n+1)^2*pi^2*t/45 + (2*n+1)^4*pi^4*t^2/(2*45^2) - (2*n+1)^6*pi^6*t^3/(6*45^3))*(2*n+1)*pi*x/3;
T = T(1.5,t);
T_EQ = 0.00001
S = solve(T - T_EQ == 0,t);
The problem that I get is that S is a vector which contains imaginary numbers. I expected a real number, because I am trying to calculate a time.
Here is a little background as to what I am trying to do:
http://hans.math.upenn.edu/~deturck/m241/solving_the_heat_eqn.pdf
In the given link the heat equation is solved for a particular one-dimensional case. The temperature distribution that satisfies the prescribed boundary and initial conditions is given on page 50, I believe.
What I would like to do is find the time at which the one-dimensional object equilibrates with the environment, which is held at a constant temperature of T=0. As far as I know, the easiest way to do this would be to use the Taylor expansion of the exponential function, using only the first few terms, because I expect the equilibrium time to be relatively short; and then use the small angle approximation for the sine function, because the rod has a relatively small length. Doing just this, I made a for loop to generate terms just as the summation function would--as you can see, I used 1000 terms.
Does what I am doing seem wrong to anyone? If there is a better method, could someone please recommend it?
You shouldn't be surprised to see imaginary roots, provided that at least one root is real and positive, corresponding to your time. The question is whether the time makes any sense given the approximations that you're making. Have you plotted the actual function to get a rough idea of where the zero is?
I can't really comment on the particular problem you're trying to solve. You need to make sure that you're using enough Taylor expansion terms and that they are accurate for the domain. Have you tried this leaving in the exp and/or sin? Is there any reason that you can't just use zero? And have you checked that your summation has converged after 1,000 terms? Or does it converge much sooner, or not at all?
The main question is why you are using symbolic math at all to solve this. This seems like a numeric problem, unless you're experiencing overflow/underflow issues in your summation. You can find the zero using fzero in this case:
N = 32; % Number of terms in summation
x = 1.5;
T_EQ = 1e-5;
n = (2*(0:N)+1)*pi;
T = @(t)sum((72./n.^3).*exp(-n.^2*t/45).*sin(n*x/3))-T_EQ;
S = fzero(T,[0 1e3]) % bounds around a root guarantee a solution if the function is monotonic
which returns
S =
56.333877640358708
If you're going to use solve, I'd do something like the following to avoid for loops:
syms t
N = 32;
x = 1.5;
T_EQ = 1e-5;
n = (2*sym(0:N)+1)*sym(pi);
T(t) = sum((72./n.^3).*exp(-n.^2*t/45).*sin(n*x/3));
S = double(solve(T-T_EQ==0,t))
or, using symsum:
syms n t
N = 32;
x = 1.5;
T_EQ = 1e-5;
T(t) = symsum((72/(pi*(2*n+1))^3)*exp(-(pi*(2*n+1))^2*t/45)*sin(pi*(2*n+1)*x/3),n,0,N);
S = double(solve(T-T_EQ==0,t))
Lastly, your symbolic solutions are not even exact, as some of your pi variables are being converted to rational approximations. pi is floating point. Things like pi*t are generally safe if t is symbolic, because pi will be promoted to symbolic. However, pi^2 is computed in floating point before being converted to symbolic, due to the order of operations. In general you should use sym('pi') or sym(pi) in symbolic expressions.
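To see the difference:
sym(pi^2) % pi^2 evaluated in double precision first, then converted to a rational approximation
sym(pi)^2 % exact symbolic pi^2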
Assuming you have a polynomial or trigonometric function of x or y, and what you mean by "zeros" is the values where the function crosses an axis (i.e. where either x or y is zero), you can evaluate the function with that variable set to 0. An example:
syms x y
f=-cos(x)*exp(-(x^2)/40);
ezsurf(f,[-10,10])
F=matlabFunction(f,'vars',{x});
F(0)
The ezsurf call just visualizes the function. If you want a function of both x and y, you can do something like the following:
syms x y
f=-cos(x)*cos(y)*exp(-(x^2+y^2)/40);
ezsurf(f,[-10,10])
F=matlabFunction(f,'vars',{[x,y]});
% restrict to the y = 0 plane and solve for x
solve(subs(f, y, 0) == 0, x)
This gives the values of x at which the function is zero in the y = 0 plane, i.e. where cos(x) = 0 (odd multiples of pi/2).