Optimization algorithm in Matlab

I want to calculate the maximum of the cross-in-tray function:

f(x1, x2) = -0.0001 * ( abs( sin(x1) * sin(x2) * exp(abs(100 - sqrt(x1^2 + x2^2)/pi)) ) + 1 )^0.1

So I have made this function in Matlab:
function f = CrossInTray2(x)
%the CrossInTray2 objective function, already multiplied by -1
%(note the positive prefactor): minimizing this maximizes the original
f = 0.0001 * (( abs(sin(x(:,1)).*sin(x(:,2)).*exp(abs(100 - sqrt(x(:,1).^2 + x(:,2).^2)/3.14159))) + 1 ).^0.1);
end
I multiplied the whole formula by -1 to invert the function, so that the minimum of the inverted formula is actually the maximum of the original one.
Then, when I go to the Optimization Tool, select the GA algorithm, and define the lower and upper bounds as -3 and 3, it shows a result after about 60 iterations of roughly 0.13, with a final point of approximately [0, 9.34].
How is it possible that the final point is not within the range defined by the bounds? And what is the actual maximum of this function?

The maximum is at (0,0) (actually, anywhere either input is 0, and periodically at multiples of pi). After you negate, you're looking for a minimum of a positive quantity. Just looking at the outer absolute value, it obviously can't get lower than 0, and that trivially occurs when either value of sin(x) is 0.
Plugging in, you have f_min = f(0,0) = 0.0001 * (0 + 1)^0.1 = 1e-4
This expression is trivial to evaluate and plot over a 2-D grid. Do that until you understand what you're looking at and what the approximate answer should be, and only then invoke an actual optimizer. GA does not sound like a good candidate for a relatively smooth expression like this. The reason you're getting strange answers is that only one of the input parameters has to be 0. Once the optimizer finds one of those, the other input could be anything.
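For example, a minimal sketch of that sanity check (assuming the CrossInTray2 function above is on the path):

[X1, X2] = meshgrid(-3:0.05:3);
F = reshape(CrossInTray2([X1(:), X2(:)]), size(X1));
surf(X1, X2, F, 'EdgeColor', 'none')  % visualize the negated objective
min(F(:))                             % hits the 1e-4 lower bound along the axes
CrossInTray2([0 0])                   % ans = 1.0000e-04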

Related

MATLAB - negative values go to NaN in a symmetric function

Can someone please explain why the following symmetric function cannot pass a certain limit of negative values?
D = 0.1; l = 4;
c = @(x,v) (v/D).*exp(-v*x/D)./(1-exp(-v*l/D));
v_vec = -25:0.01:25;
figure(2)
hold on
plot(v_vec,c(l,v_vec),'b')
plot(v_vec,c(0,v_vec),'r')
Notice in the figure where the blue line cuts off; this is where I get inf/nan values.
It seems that Matlab is trying to compute a result that is too large, outputs +inf, and then operates on that, which yields +/-inf and NaNs.
For instance, at v = -25, part of the function computes exp(-(-25)*4/0.1), which is exp(1000), and that overflows to +inf (it is larger than the largest representable double-precision float, about 1.8e308).
You can potentially solve that problem by rewriting your function to avoid operating on such very large (or very small) numbers, for example by reorganising the fraction containing the exp() terms.
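A minimal sketch of one such reorganisation (assuming 0 <= x <= l): multiplying the numerator and denominator by exp(v*l/D) gives an equivalent form whose exponent arguments are non-positive for v < 0, so each half of the v range can use the form that cannot overflow.

D = 0.1; l = 4;
% original form: exponent arguments are non-positive for v >= 0
cPos = @(x,v) (v/D).*exp(-v.*x/D)./(1 - exp(-v*l/D));
% equivalent form, multiplied through by exp(v*l/D): safe for v < 0
cNeg = @(x,v) (v/D).*exp(v.*(l-x)/D)./(exp(v*l/D) - 1);
v_vec = -25:0.01:25;
vn = v_vec(v_vec < 0);  vp = v_vec(v_vec >= 0);
plot(vn, cNeg(l,vn), 'b', vp, cPos(l,vp), 'b')  % no overflow on either half
% (v = 0 is still 0/0, as in the original; the limit there is 1/l)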
I encountered the same hurdle using exp() with arguments that trigger overflow. Numeric imprecision and convergence errors can be difficult to trace back. In principle, the exp() terms in your function definition only cause intermediate issues; I guess the intention was to provide a continuous transition function.
My solution to this problem is to divide the argument range into regions and provide an approximation function in each region: in your case, zero for negative x and proportional to x for positive x, with the original function in between. Care should be taken to match the approximations at the borders of the regions, and to match the number of continuous derivatives, which is important for convergence in loops.
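A minimal sketch of that region idea (the function g, the cut points a and b, and the slope of the linear continuation are all illustrative assumptions; for a real problem the pieces should be matched at the borders):

function y = gSafe(u)
% evaluate a hypothetical function g piecewise to avoid overflow
a = -10;  b = 10;             % region borders (assumed)
y = zeros(size(u));           % u < a: approximate by 0
mid = (u >= a) & (u <= b);
y(mid) = g(u(mid));           % a <= u <= b: use the original function
hi = u > b;
y(hi) = g(b) + (u(hi) - b);   % u > b: linear continuation with assumed slope 1
end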

min/max of a function without built in operators nor for/while/if?

We have to determine the min/max values of a 1x6000 array without using the operators mentioned in the title. We have to use algorithms learned in class, but I don't see how that translates to Matlab, as I will need to do a certain number of iterations before I get the right answer.
We learned the bisection method, muller method, newton method, fixed point method, etc.
Please don't write the code, as this is my homework, and I'm trying to learn something, but if you could guide me in the right direction...
Thank you.
Disclaimer
I'm a bit confused about why you can't use a for or while loop, because Newton's method, the bisection method, and any fixed-point method are iterative algorithms which require loops. As such, this answer assumes that, apart from the loops inside these methods themselves, no other loops are allowed.
Since you're allowed to use Newton's method or the bisection method, remember that these methods find the root of a function, i.e. where the function output is equal to 0. Also remember that the points where the derivative equals 0 are the critical points (minima and maxima) of the function. Therefore, if you ran Newton's or the bisection method on the derivative of your function, you would be able to determine the critical points and hence where the min or max are. However, these methods don't tell you whether a point is a minimum or a maximum. For that, you have to examine the sign of the second derivative at these points: if the sign is positive, it is a minimum, and if the sign is negative, it is a maximum.
Because you only have an array of points and not the actual function itself, have a look at diff, which computes a discrete approximation to the derivative (i.e. a finite difference) of an array. Also, since Newton's method and the bisection method will inevitably land on points that don't fall on integer indices, have a look at interp1 to interpolate and find the approximate value between integer points.
I'll do only the case for Newton's method. You can derive a similar set of rules for the bisection method. Recall that Newton's method is defined such that:
x_{i+1} = x_i - f(x_i) / f'(x_i)
x_i is the current guess of the root while x_{i+1} is the next guess. f(x_i) is the function evaluated at x_i while f'(x_i) is its derivative. Since you want to find the critical points where the derivative (not the original function) is equal to 0, replace f with f' and f' with the second derivative f'', so the update becomes x_{i+1} = x_i - f'(x_i) / f''(x_i).
Given that X is the array of points you have, the pseudocode could look something like this:
Use dx = diff(X, 1) to compute the first derivative and dx2 = diff(X, 2) to compute the second derivative.
Specify an initial guess of the root of the derivative, xp, i.e. where you think the minimum or maximum point is. The halfway point, xp = floor(numel(X)/2);, may be a good place to start.
Run Newton's method N times, or until the difference between successive guesses of the root is less than some threshold, using xp as the initial guess and updating xp for future guesses. For each guess of the root, use interp1 to determine the approximate values of the first and second derivatives at that point. The Newton update would therefore look something like:
i1 = interp1(1:numel(dx), dx, xp, 'linear');
i2 = interp1(1:numel(dx2), dx2, xp, 'linear');
xp = xp - (i1 / i2);
i1 is the derivative evaluated at the current guess xp while i2 is the second derivative evaluated at the current guess xp. We then do the update.
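Put together, the iteration might look something like this (a hint rather than a complete solution; the data X and the step count are placeholders, and xp is clamped so that interp1 stays in range):

X = sin(linspace(0, 3*pi, 6000));     % placeholder data with interior extrema
dx  = diff(X, 1);                     % first-derivative approximation
dx2 = diff(X, 2);                     % second-derivative approximation
xp = floor(numel(X)/2);               % initial guess: the halfway point
for k = 1:50                          % fixed number of Newton steps
    i1 = interp1(1:numel(dx),  dx,  xp, 'linear');
    i2 = interp1(1:numel(dx2), dx2, xp, 'linear');
    xp = xp - i1 / i2;                % Newton update on the derivative
    xp = min(max(xp, 1), numel(dx2)); % keep the guess inside the array
end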
xp will now contain the location of the critical point. However, it will generally not be an integer, so using it directly to index into your original array X to get the minimum or maximum is not valid. As a heuristic, you can round xp, which makes sense: a fractional part < 0.5 suggests leaning towards the value on the left, while >= 0.5 suggests leaning towards the right.
Once Newton's method converges, you need to check if the value is a minimum or maximum. Since you can't use if, a trick here is to have your function output a two-element array where both values are initialized to NaN. Depending on the sign of the second derivative at xp, you can manually compute the right index and populate only one element of this array, leaving the other as NaN. The position that is not NaN then tells you whether it's a minimum or a maximum. I'll stick with the convention where the first element is the minimum and the second element is the maximum. You can determine this concretely with:
minmax = [NaN NaN];
ind = (interp1(1:numel(dx2), dx2, xp, 'linear') < 0) + 1;
minmax(ind) = X(round(xp));
The second statement calculates the index of where we need to populate the value in this array: the parenthesized comparison yields 0 if the second derivative at xp is positive and 1 if it is negative, and adding an offset of 1 makes the index either 1 (minimum) or 2 (maximum), matching the convention above.
What if you have more than one minimum or maximum?
It is possible to have more than one minimum or maximum with the same value over the entire array, or you may want to find both the minimum and the maximum. In that case, you'd have to proceed recursively: first run Newton's method to find one critical point, then split your array into two halves, where the first half runs from the first point in X up to this critical point and the second half runs from the critical point to the end. You'd also have to take the initial guess as the middle of each half, to avoid simply returning values at the extremities of the array. Achieving this without a loop requires recursion; if you think about it, this is almost the same structure as merge sort, and I wouldn't be surprised if merge sort's logic could help you locate the minimum and maximum values. However, since you stated you can only use fixed-point methods, I didn't suggest recursion to begin with.
I've given some code hints, but as you said you don't want full code, this should hopefully get you started. Be advised that there may be some numerical inaccuracies due to diff, and this is an algorithm I haven't tried or tested; it's an idea I thought of while reading your question.
Good luck!

Matlab double integral with heavily suppressed exponentials

I have recently been trying to calculate the double integral of the function
fun = @(v,x) (10^4)*0.648*(1+v*0.001).*( exp(-2.83./( 10^-8+(sqrt(1+2*v*0.001)).*(x.^2)) ) -1).*(exp(-(v.^2)*0.33));
over the range (-1000, 1000) for v and (0, a) for x, where a is either a very large number or infinity. What I have found is that for a = inf the value seems to be decently accurate (the problem reduces to a single integral which is less cumbersome to evaluate numerically), but if I integrate from 0 to 10^9 and from 10^9 to infinity, the integrals don't sum up to the correct value, with the latter one being zero. What I am really interested in is the integral from 0 to 10^9, but these results make me wonder if I can trust it at all.
In what I have done, I also had to use a large prefactor (10^200) in front of the function to "compensate" for the small numbers; otherwise the results were all nonsense. I have tried to use vpa, but with no success. What am I doing wrong?
Rob
Looks like your problem has to do with the different methods Matlab uses for different cases and the big numbers you are handling.
We can look at your function with ezsurf just to get an idea of how it behaves.
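For example (the plotting range here is an assumption, just to get a quick look):

ezsurf(fun, [-100 100 0 100])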
So hint 1 is that the value is going to be negative; let's integrate over small limits to see an approximation of how much it will be.
integral2(fun,-100,100,0,100)
%ans =
% -5.9050e+04
And assuming that the function tends to zero, we know the final value should be in that neighborhood.
Now hint 2:
integral2(fun,-1000,1000,0,100)
%ans =
% -2.5613e-29
This doesn't make much sense; by increasing the range of the limits, the integral basically became zero. Let's check the documentation of integral2:
'Method' — Integration method: 'auto' (default) | 'tiled' | 'iterated'
Integration method, specified as the comma-separated pair consisting of 'Method' and one of the methods described below.
'auto': For most cases, integral2 uses the 'tiled' method. It uses the 'iterated' method when any of the integration limits are infinite. This is the default method.
'tiled': integral2 transforms the region of integration to a rectangular shape and subdivides it into smaller rectangular regions as needed. The integration limits must be finite.
'iterated': integral2 calls integral to perform an iterated integral. The outer integral is evaluated over xmin ≤ x ≤ xmax. The inner integral is evaluated over ymin(x) ≤ y ≤ ymax(x). The integration limits can be infinite.
OK, so if we don't define a method it will use 'tiled' if the limits are finite, and 'iterated' if they are infinite.
Could it be that if the range is too big, the tiles created by the 'tiled' method are too big to accurately calculate the integral? If that is the case, then 'iterated' should not have that problem. Let's check:
integral2(fun,-1000,1000,0,100,'Method','iterated')
%ans =
% -5.9050e+04
Interesting, looks like we are onto something. Let's try the original problem:
integral2(fun,-1000,1000,0,inf)
%ans =
% -5.9616e+04
integral2(fun,-1000,1000,0,10^9,'Method','tiled')
%ans =
% -2.1502e-33
integral2(fun,-1000,1000,0,10^9,'Method','iterated')
%ans =
% -5.9616e+04
integral2(fun,-1000,1000,10^9,inf)
%ans =
% 0
That looks better. So it seems the 'tiled' method is the problem with your function, because of its characteristics and the size of the integration range. As long as you use 'iterated', you should be OK.

Solving equations involving dozens of ceil and floor functions in MATLAB?

I am tackling a problem which uses lots of equations of the form

q_i(x) = c_i(x) + sum_{j=1..N} floor( q_i(x) / P_j ) * C_j

(with some terms using ceil instead of floor), where q_i(x) is the only unknown, and c_i, C_j, P_j are always positive. We have two cases: the first when c_i, C_j, P_j are integers, and the second when they are real. C_j < P_j for all j.
How is this type of problem efficiently solved in MATLAB, especially when the number of iterations N is between 20 and 100?
What I was doing is: q_i(x) - c_i(x) must be equal to the summation of integer multiples of the C_j, so I was doing an exhaustive search for a q_i(x) which satisfies both sides of the equation. Clearly this is computationally expensive.
What if c_i(x) is a floating-point number? This will make it even more difficult to find a real q_i(x).
MORE INFO: These equations are from the paper "Integrating Preemption Threshold to Fixed Priority DVS Scheduling Algorithms" by Yang and Lin.
Thanks
You can use the bisection method to numerically find zeros of almost any well-behaved function.
Convert your equation into a zero-finding problem by moving everything to one side of the equals sign, then find x such that f(x) = 0.
Apply the bisection method as the equation solver.
That's it! Or maybe not...
If you have specific range(s) where the roots should fall, just perform the bisection method on each range. If not, you still have to give a maximum estimate (some number you don't expect any root to exceed) and use that as the range.
The problem with this method is that for each given range it can only find one root, because it always picks the left (or right) half of the range. That's OK if P_j is an integer, as you can then find a minimum step of the function: say P_j = 1, then only a change in q_i larger than 1 leads to another segment (and thus a possibly different root), so within each range shorter than 1 there is at most one solution.
If P_j is an arbitrary number (such as 1e-10), then unless you have a lower limit on P_j you are most likely out of luck, since you can't tell how fast the function will jump. That essentially means f(x) is not a well-behaved function, which makes it hard to solve.
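A minimal bisection sketch, illustrated on the same step function used in the next answer (the bracket [0, 3] is an assumption, chosen so that f changes sign once inside it; note that on a step function bisection converges to a sign change, which may be a jump rather than a true root, so check f at the result):

% zero-form of q = 2 + floor(q/3)*0.5 + floor(q/4)*3 + floor(q/2)*0.3
f = @(q) q - (2 + floor(q/3)*0.5 + floor(q/4)*3 + floor(q/2)*0.3);
a = 0;  b = 3;                 % bracket with f(a) < 0 < f(b)
for k = 1:60                   % 60 halvings shrink the bracket below 1e-17
    m = (a + b) / 2;
    if sign(f(m)) == sign(f(a))
        a = m;                 % root is in the right half
    else
        b = m;                 % root is in the left half
    end
end
root = (a + b) / 2             % converges to 2.3 for this example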
The sum is a step function. You can discretize the problem by calculating where the floor function jumps to the next value; this is periodic for every j. Then you overlay the N "rhythms" (each has its own speed, specified by its P_j) and get all the locations where the sum jumps. Each segment can have exactly 0 or 1 intersections with q_i(x). You should visualize the problem for intuitive understanding, like this:
f = @(q) 2 + (floor(q/3)*0.5 + floor(q/4)*3 + floor(q/2)*.3);
xx = -10:0.01:10;
plot(xx,f(xx),xx,xx)
For each step, it can be checked analytically if an intersection exists or not.
jumps = unique([0:3:10,0:4:10,0:2:10]); % Vector with position of jumps
lBounds = jumps(1:end-1); % Vector with lower bounds of stairs
uBounds = jumps(2:end); % Vector with upper bounds of stairs
middle = (lBounds+uBounds)/2; % center of each stair
fStep = f(middle); % height of the stairs
intersection = fStep; % Solution of linear function q=fStep
% Check if intersection is within the bounds of the specific step
solutions = intersection(intersection>=lBounds & intersection<uBounds)
solutions =
    2.3000    6.9000

How to find minimum of nonlinear, multivariate function using Newton's method (code not linear algebra)

I'm trying to do some parameter estimation and want to choose parameter estimates that minimize the square error in a predicted equation over about 30 variables. If the equation were linear, I would just compute the 30 partial derivatives, set them all to zero, and use a linear-equation solver. But unfortunately the equation is nonlinear and so are its derivatives.
If the equation were over a single variable, I would just use Newton's method (also known as Newton-Raphson). The Web is rich in examples and code to implement Newton's method for functions of a single variable.
Given that I have about 30 variables, how can I program a numeric solution to this problem using Newton's method? I have the equation in closed form and can compute the first and second derivatives, but I don't know quite how to proceed from there. I have found a large number of treatments on the web, but they quickly get into heavy matrix notation. I've found something moderately helpful on Wikipedia, but I'm having trouble translating it into code.
Where I'm worried about breaking down is in the matrix algebra and matrix inversions. I can invert a matrix with a linear-equation solver but I'm worried about getting the right rows and columns, avoiding transposition errors, and so on.
To be quite concrete:
I want to work with tables mapping variables to their values. I can write a function of such a table that returns the square error given such a table as argument. I can also create functions that return a partial derivative with respect to any given variable.
I have a reasonable starting estimate for the values in the table, so I'm not worried about convergence.
I'm not sure how to write the loop that uses an estimate (a table of values for each variable), the function, and a table of partial-derivative functions to produce a new estimate.
That last is what I'd like help with. Any direct help or pointers to good sources will be warmly appreciated.
Edit: Since I have the first and second derivatives in closed form, I would like to take advantage of them and avoid more slowly converging methods like simplex searches.
The Numerical Recipes link was most helpful. I wound up symbolically differentiating my error estimate to produce 30 partial derivatives, then used Newton's method to set them all to zero. Here are the highlights of the code:
__doc.findzero = [[function(functions, partials, point, [epsilon, steps]) returns table, boolean
Where
  point is a table mapping variable names to real numbers
    (a point in N-dimensional space)
  functions is a list of functions, each of which takes a table like
    point as an argument
  partials is a list of tables; partials[i].x is the partial derivative
    of functions[i] with respect to 'x'
  epsilon is a number that says how close to zero we're trying to get
  steps is max number of steps to take (defaults to infinity)
  result is a table like 'point', boolean that says 'converged'
]]
-- See Numerical Recipes in C, Section 9.6 [http://www.nrbook.com/a/bookcpdf.php]
function findzero(functions, partials, point, epsilon, steps)
  epsilon = epsilon or 1.0e-6
  steps = steps or 1/0
  assert(#functions > 0)
  assert(table.numpairs(partials[1]) == #functions,
         'number of functions not equal to number of variables')
  local equations = { }
  repeat
    if Linf(functions, point) <= epsilon then
      return point, true
    end
    for i = 1, #functions do
      local F = functions[i](point)
      local zero = F
      for x, partial in pairs(partials[i]) do
        zero = zero + lineq.var(x) * partial(point)
      end
      equations[i] = lineq.eqn(zero, 0)
    end
    local delta = table.map(lineq.tonumber, lineq.solve(equations, {}).answers)
    point = table.map(function(v, x) return v + delta[x] end, point)
    steps = steps - 1
  until steps <= 0
  return point, false
end
function Linf(functions, point)
  -- distance using the L-infinity norm
  assert(#functions > 0)
  local max = 0
  for i = 1, #functions do
    local z = functions[i](point)
    max = math.max(max, math.abs(z))
  end
  return max
end
You might be able to find what you need at the Numerical Recipes in C web page; there is a free version available online. Here (PDF) is the chapter containing the Newton-Raphson method implemented in C. You may also want to look at what is available at Netlib (LINPACK et al.).
As an alternative to using Newton's method, the simplex method of Nelder-Mead is ideally suited to this problem and referenced in Numerical Recipes in C.
Rob
You are asking for a function minimization algorithm. There are two main classes: local and global. Your problem is least squares, so both local and global minimization algorithms should converge to the same unique solution. Local minimization is far more efficient than global, so select that.
There are many local minimization algorithms, but one particularly well suited to least-squares problems is Levenberg-Marquardt. If you don't have such a solver to hand (e.g. from MINPACK), then you can probably get away with Newton's method:
x <- x - (hessian x)^-1 * grad x
where you compute the inverse matrix multiplied by a vector using a linear solver.
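For instance, a minimal MATLAB sketch of that update (grad, hess, and the starting point x0 are assumed to be supplied by you; the iteration cap and tolerance are arbitrary):

x = x0;                          % initial estimate
for k = 1:100
    g = grad(x);                 % gradient vector, n x 1
    if norm(g) < 1e-8, break, end
    H = hess(x);                 % Hessian matrix, n x n
    x = x - H \ g;               % linear solve instead of forming inv(H)
end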
Since you already have the partial derivatives, how about a general gradient-descent approach?
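For instance, a minimal sketch (grad and x0 are assumptions, and the fixed step size alpha would need tuning or a line search in practice):

x = x0;
alpha = 1e-3;                    % fixed step size (assumed)
for k = 1:10000
    g = grad(x);
    if norm(g) < 1e-6, break, end
    x = x - alpha * g;           % step along the negative gradient
end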
Maybe you think you have a good-enough solution, but for me, the easiest way to think about this is to understand it in the 1-variable case first, and then extend it to the matrix case.
In the 1-variable case, if you divide the first derivative by the second derivative, you get the (negative) step size to your next trial point, e.g. -V/A.
In the N-variable case, the first derivative is a vector and the second derivative is a matrix (the Hessian). You multiply the derivative vector by the inverse of the second derivative, and the result is the negative step-vector to your next trial point, e.g. -V*(1/A)
I assume you can get the 2nd-derivative Hessian matrix. You will need a routine to invert it. There are plenty of these around in various linear algebra packages, and they are quite fast.
(For readers who are not familiar with this idea, suppose the two variables are x and y, and the surface is v(x,y). Then the first derivative is the vector:
V = [ dv/dx, dv/dy ]
and the second derivative is the matrix:
A = [ dV/dx ]
    [ dV/dy ]

or:

A = [ d(dv/dx)/dx,  d(dv/dy)/dx ]
    [ d(dv/dx)/dy,  d(dv/dy)/dy ]

or:

A = [ d^2v/dx^2,  d^2v/dydx ]
    [ d^2v/dxdy,  d^2v/dy^2 ]
which is symmetric.)
If the surface is parabolic (constant 2nd derivative) it will get to the answer in one step. On the other hand, if the 2nd derivative is far from constant, you could encounter oscillation; cutting each step in half (or by some fraction) should make it stable.
If N == 1, you'll see that it does the same thing as in the 1-variable case.
Good luck.
Added: You wanted code:
double X[N];
// Set X to initial estimate
while (!done) {
    double V[N];     // 1st derivative "velocity" vector
    double A[N*N];   // 2nd derivative "acceleration" matrix
    double A1[N*N];  // inverse of A
    double S[N];     // step vector

    CalculateFirstDerivative(V, X);
    CalculateSecondDerivative(A, X);
    GetMatrixInverse(A, A1);      // A1 = 1/A
    VectorTimesMatrix(V, A1, S);  // S = V*(1/A)
    // if S is small enough, stop
    VectorMinusVector(X, S, X);   // X -= S
}
My opinion is to use a stochastic optimizer, e.g., a Particle Swarm method.
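For example, a minimal sketch using the particleswarm solver from MATLAB's Global Optimization Toolbox (the bounds and the squareError objective name are placeholders for your own problem):

nvars = 30;                                       % ~30 parameters, per the question
lb = -10 * ones(1, nvars);                        % assumed lower bounds
ub =  10 * ones(1, nvars);                        % assumed upper bounds
x = particleswarm(@squareError, nvars, lb, ub);   % squareError: your squared-error objective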