Positive directional derivative for linesearch - scipy

What does the scipy.optimize exit message (smode) 'Positive directional derivative for linesearch' mean?
For example, in fmin_slsqp:
http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.fmin_slsqp.html

These optimization algorithms typically work by choosing a descent direction and then performing a line search in that direction. I think this message means that the optimizer got into a position where it did not manage to find a direction in which the value of the objective function decreases (fast enough), but could also not verify that the current position is a minimum.
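For intuition, here is a generic backtracking line search sketch (my own illustration, not SLSQP's actual code): the chosen direction d only helps if the directional derivative grad(x)·d is negative; if it comes out positive, no step along d can be expected to decrease f, which is essentially the situation the message describes.

import numpy as np

def backtracking_line_search(f, grad, x, d, alpha=1.0, rho=0.5, c=1e-4):
    # Illustrative only: slope is the directional derivative of f along d.
    slope = np.dot(grad(x), d)
    if slope >= 0:
        # d is not a descent direction: the "positive directional derivative" case.
        raise ValueError("positive directional derivative for linesearch")
    # Shrink the step until the Armijo sufficient-decrease condition holds.
    while f(x + alpha * d) > f(x) + c * alpha * slope:
        alpha *= rho
    return alpha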

I still don't know exactly what it means, but here is how to work around it. Basically, the function that is optimized needs to return a smaller value, e.g. by scaling it down:
def F(x):
    ...
    return value / 10000000

To avoid changing your function, you can also try experimenting with the ftol and eps parameters. Increasing ftol has roughly the same effect as scaling the function's return value down.
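For example (a minimal sketch; x0 is a placeholder starting point, F is the scaled objective from above, and the values below are just illustrative):

from scipy.optimize import minimize

res = minimize(F, x0, method='SLSQP',
               options={'ftol': 1e-4,   # looser than the default 1e-6
                        'eps': 1e-6})   # finite-difference step, default ~1.5e-8
# With fmin_slsqp the corresponding arguments are acc and epsilon.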

One situation in which you receive this error is when x0 is outside the valid range you defined in bounds, and the unconstrained optimum is attained at values outside bounds.
I will set up a hypothetical optimization problem, run it with two different initial values and print the output of scipy.optimize:
import numpy as np
from scipy import optimize

H = np.array([[2., 0.],
              [0., 8.]])
c = np.array([0, -32])

x0 = np.array([0.5, 0.5])  # valid initial value
x1 = np.array([-1, 1.1])   # invalid initial value

def loss(x, sign=1.):
    return sign * (0.5 * np.dot(x.T, np.dot(H, x)) + np.dot(c, x))

def jac(x, sign=1.):
    return sign * (np.dot(x.T, H) + c)

bounds = [(0, 1), (0, 1)]
Now that the loss function, gradient, x0 and bounds are in place, we can solve the problem:
def solve(start):
    res = optimize.minimize(fun=loss,
                            x0=start,
                            jac=jac,
                            bounds=bounds,
                            method='SLSQP')
    return res
solve(x0) # valid initial value
# fun: -27.999999999963507
# jac: array([ 2.90878432e-14, -2.40000000e+01])
# message: 'Optimization terminated successfully.'
# ...
# status: 0
# success: True
# x: array([1.45439216e-14, 1.00000000e+00])
solve(x1) # invalid initial value:
# fun: -29.534653465326528
# jac: array([ -1.16831683, -23.36633663])
# message: 'Positive directional derivative for linesearch'
# ...
# status: 8
# success: False
# x: array([-0.58415842, 1.07920792])
As @pv. pointed out in the accepted answer, the algorithm can't verify that this is a minimum:
I think this message means that the optimizer got into a position where it did not manage to find a direction where the value of the objective function decreases (fast enough), but could also not verify that the current position is a minimum.
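A simple workaround (my own sketch, not part of the quoted answer) is to project the starting point into the feasible box before calling the solver:

lower = np.array([b[0] for b in bounds])
upper = np.array([b[1] for b in bounds])
x1_clipped = np.clip(x1, lower, upper)  # [-1, 1.1] -> [0, 1]
solve(x1_clipped)  # should now terminate successfully, as with x0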

It's not a complete answer, but you can see the source code that generates the smode here:
https://github.com/scipy/scipy/blob/master/scipy/optimize/slsqp/slsqp_optmz.f
Assignments of mode = 8 (the "Positive directional derivative for linesearch" you are asking about) can be found in lines 412 and 486. If you can figure out why they are assigned in the code, you've got your answer.

Related

scipy integrate.quad returns an incorrect value

I use scipy integrate.quad to calculate the CDF of a normal distribution:
import math
import numpy as np
from scipy import integrate

def nor(delta, mu, x):
    return 1 / (math.sqrt(2 * math.pi) * delta) * np.exp(-np.square(x - mu) / (2 * np.square(delta)))

delta = 0.1
mu = 0
t = np.arange(4.0, 10.0, 1)
nor_int = lambda t: integrate.quad(lambda x: nor(delta, mu, x), -np.inf, t)
nor_int_vec = np.vectorize(nor_int)
s = nor_int_vec(t)
for i in zip(s[0], s[1]):
    print(i)
It prints the following:
(1.0000000000000002, 1.2506543424265854e-08)
(1.9563704110140217e-11, 3.5403445591955275e-11)
(1.0000000000001916, 1.2616577562700088e-08)
(1.0842532749783998e-34, 1.9621183122960244e-34)
(4.234531567162006e-09, 7.753407284370446e-09)
(1.0000000000001334, 1.757986959115912e-10)
For some x it returns a value close to zero when it should return 1.
Can somebody tell me what is wrong?
Same reason as in "why does quad return both zeros when integrating a simple Gaussian pdf at a very small variance?" but seeing as I can't mark it as a duplicate, here goes:
You are integrating a function with tight localization (at scale delta) over a very large (in fact infinite) interval. The integration routine can simply miss the part of the interval where the function is substantially different from 0, judging it to be 0 instead. Some guidance is required. The parameter points can be used to this effect (see the linked question) but since quad over an infinite interval does not support it, the interval has to be manually split, like so:
for t in range(4, 10):
    int1 = integrate.quad(lambda x: nor(delta, mu, x), -np.inf, mu - 10*delta)[0]
    int2 = integrate.quad(lambda x: nor(delta, mu, x), mu - 10*delta, t)[0]
    print(int1 + int2)
This prints 1 or nearly 1 every time. I picked mu-10*delta as a point to split on, figuring most of the function lies to the right of it, no matter what mu and delta are.
Notes:
Use np.sqrt etc.; there is usually no reason to use math functions in NumPy code. The NumPy versions are available and are vectorized.
Applying np.vectorize to quad does nothing besides making the code longer and slightly harder to read. Use a normal Python loop or list comprehension (see the sketch below, and NumPy vectorization with integration).
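For instance, a plain-loop version of the same computation (a sketch reusing nor, delta, mu and t from the question):

from scipy import integrate

# one quad call per upper limit; each call returns (value, error_estimate)
results = [integrate.quad(lambda x: nor(delta, mu, x), -np.inf, ti) for ti in t]
for value, error in results:
    print(value, error)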

python scipy odeint return values

I have been playing around with odeint in scipy and I could not understand what the function returns as return values. For example,
# -*- coding: utf-8 -*-
"""
Created on Sat Feb 04 20:01:16 2017
@author: Esash
"""
from scipy.integrate import odeint
import matplotlib.pyplot as plt
import numpy as np

def MassSpring(state, t):
    # unpack the state vector
    x = state[0]
    xd = state[1]
    # these are our constants
    k = -5.5  # Newtons per metre
    m = 1.5   # Kilograms
    g = 9.8   # metres per second squared
    # compute acceleration xdd
    xdd = ((k*x)/m) + g
    # return the two state derivatives
    return [xd, xdd]

state0 = [0.0, 0.0]
t = np.arange(0.0, 10.0, 0.1)

state = odeint(MassSpring, state0, t)

plt.plot(t, state)
plt.xlabel('TIME (sec)')
plt.ylabel('STATES')
plt.title('Mass-Spring System')
plt.legend(('$x$ (m)', r'$\dot{x}$ (m/sec)'))
In the above code, I have set the two parameters to 0.0 and 0.0, and xd in the function is just 0.0, which I return as well. But the returned values are not just 0.0; they vary.
In [14]: state
Out[14]:
array([[ 0.        ,  0.        ],
       [ 0.04885046,  0.97402207],
       [ 0.19361613,  1.91243899],
       ...,
       [ 0.10076832, -1.39206172],
       [ 0.00941998, -0.42931942],
       [ 0.01542821,  0.54911655]])
Also, if I have one differential equation for which I need to send many parameters, then I cannot send M parameters in the odeint call as a list or tuple and return only the solution of the ODE as a single array. It expects that the number of parameters sent should be equal to the number of parameters returned from the function. Why is this?
I am not able to understand how this function works. Can someone please explain this to me? My apologies if I sound too confusing.
Thanks a lot.
I could not understand what the function returns as return values.
The return value of odeint is the computed solution at the requested time values. That is, after this call
state = odeint(MassSpring, state0, t)
state[0] is [x(t[0]), x'(t[0])], state[1] is [x(t[1]), x'(t[1])], etc. If you wanted to plot just the x coordinate, you could call plt.plot(t, state[:, 0]) to plot the first column of state.
I have set the two parameters as 0.0 and 0.0 [...]
What you are calling the "parameters" are usually called the initial conditions. They are the values of x(t) and x'(t) at t=0.
But the return value is not just 0.0, it varies.
That is because (0, 0) is not an equilibrium of the system. Look at the equation
xdd = ((k*x)/m) + g
When x is 0, you get xdd = g, so xdd is initially positive. That is, there is a nonzero force (gravity) acting on the mass, so it accelerates.
The equilibrium state is [-g*m/k, 0].
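You can check this directly (a quick sketch using the constants from the question):

k, m, g = -5.5, 1.5, 9.8
eq = [-g * m / k, 0.0]       # approximately [2.6727, 0.0]
print(MassSpring(eq, 0.0))   # [0.0, 0.0] up to floating-point rounding: no motion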
Also, if I have one differential equation for which I need to send many parameters, then I cannot send M parameters in the odeint call as a list or tuple and return only the solution of the ODE as a single array. It expects that the number of parameters sent should be equal to the number of parameters returned from the function. Why is this?
odeint only solves the system for one set of initial conditions at a time. If you want to generate several solutions (corresponding to different initial conditions), you'll have to call odeint multiple times.
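For example (a sketch; the extra initial conditions below are arbitrary):

initial_conditions = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
solutions = [odeint(MassSpring, s0, t) for s0 in initial_conditions]
for s0, sol in zip(initial_conditions, solutions):
    plt.plot(t, sol[:, 0], label='state0 = {}'.format(s0))  # x(t) for each start
plt.legend()
plt.show()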

angle() of a real number

I'm a bit confused about the angle() function in Matlab, in particular when applied to an array of real numbers.
The angle() function should give me the phase of a complex number. Example: y = a + bi, ==> phase = arctan(b/a). Indeed, the following works:
for t = 1:1000
    comp(t) = exp(1i*(t/10));
end
phase_good_comp1 = unwrap(angle(comp)); % this gives me the right answer
b = imag(comp);
a = real(comp);
phase_good_comp2 = atan(b./a); % this gives me the right answer too, but wrapped (not sure if there is a way to unwrap this, but unwrap() does not work)
figure(1)
plot(phase_good_comp1)
hold on
plot(phase_good_comp2,'--r')
legend('good phase1', 'good phase2')
title('complex number')
Here's the plot for the complex numbers --
Note that I can use either the angle() function, or the explicit definition of phase, as I have shown above. Both yield good results (I can't unwrap the latter, but that's not my issue).
Now if I apply the same logic to an array of real numbers, I should get a constant phase everywhere, since no imaginary part exists, so arctan(b/a) = arctan(0) = 0. This works if I use the explicit definition of phase, but I get a weird result if I use angle():
for t = 1:1000
    ree(t) = cos((t/10));
end
phase_bad_re = unwrap(angle(ree)); % this gives me an unreasonable (?) answer
b = imag(ree);
a = real(ree);
phase_good_re = atan(b./a); % this gives me the right answer
figure(1)
plot(phase_bad_re)
hold on
plot(phase_good_re,'--r')
legend('bad phase', 'good phase')
title('real number')
Here's the plot for the real numbers --
Why the oscillation when I use angle()???
The Matlab documentation tells you how to compute this:
The angle function can be expressed as angle(z) = imag(log(z)) = atan2(imag(z),real(z)).
https://www.mathworks.com/help/matlab/ref/angle.html
Note that they define it with atan2 instead of atan.
Now your data is in the range of cosine, which includes both positive and negative numbers. The angle on the positive numbers should be 0 and the angle on the negative numbers should be an odd-integer multiple of pi in general. Using the specific definition that they've chosen to get a unique answer, it is pi. That's what you got. (Actually, for the positive numbers, any even-integer multiple of pi will do, but 0 is the "natural" choice and the one that you get from atan2.)
If you're not clear why the negative numbers don't have angle = 0, plot it out in the complex plane and keep in mind that the radial part of the complex number is positive by definition. That is z = r * exp(i*theta) for positive r and theta given by this angle you're computing.
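The same behaviour is easy to reproduce in NumPy, whose np.angle uses the same atan2-based definition (a quick check, not MATLAB code):

import numpy as np

ree = np.cos(np.arange(1, 1001) / 10.0)
print(np.angle(1.0), np.angle(-1.0))   # 0.0 and pi
print(np.unique(np.angle(ree)))        # only 0 and pi: hence the apparent oscillation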
Since the sign of the cosine changes periodically, angle() oscillates as well.
Please try this:
a = angle(1);
b = angle(-1);
The phase of 1 + i*0 is 0, while the phase of -1 + i*0 is 3.14 (pi).
But in the case of atan, b/a is always 0, so the result of atan() is all 0.

Using basinhopping to find the global minimum of a not-so-well behaved function

I have a function of type R × R -> R defined as
f(x,y) = (y-5)^2   if x = 9
         (x-9)^2   otherwise
This function is smooth except near x = 9. It can also be observed that this function is non-negative and
f(x,y) == 0  <==>  (x,y) = (9,5).
Thus, (9,5) is the only global minimum of f. But I find it very hard to locate this point (or approximate it) using scipy's global optimization tool. I have tried with scipy.optimize.basinhopping:
import scipy.optimize as op

def f(X):
    if X[0] == 9:
        return (X[1]-5)**2
    else:
        return (X[0]-9)**2

res = op.basinhopping(f, [7.0, 7.0], minimizer_kwargs={'method': 'powell'}, niter=100, stepsize=50)
print(res)
And I get
fun: 3.1554436208840472e-30
nfev: 5806
message: ['requested number of basinhopping iterations completed successfully']
nit: 100
x: array([ 9. , 20.01509711])
Is there a way to set properly basinhopping so the true global minimum (9,5) can be found?
This is an extremely hard problem for any numerical optimization algorithm. The minimizer will move towards x=9, but the convergence criterion will be reached long before 9 is actually reached. It would have to get to exactly 9 before it knows anything about the y axis. That can only happen through luck or pre-existing knowledge about the problem. Even a brute force optimization technique like a grid search could only solve it through luck.
L-BFGS-B stops after 2 iterations, thinking it found the solution at [9.00000017, 7.] with function value 2.773339593234183e-14 and gradient [3.43066965e-07, 0.00000000e+00].
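You can see why any local view carries no information about y by probing the objective directly (a quick check with the f from the question):

print(f([9.0, 20.0]), f([9.0, 5.0]))                   # 225.0 and 0.0: y only matters at x == 9
print(f([9.0 + 1e-12, 20.0]), f([9.0 + 1e-12, 5.0]))   # both ~1e-24: off x == 9, y is invisible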

Octave fminsearch: Problems with minimization and options

I am trying to use Octave's fminsearch function, which I have used in MATLAB before. The function does not seem sufficiently documented (for me at least), and I have no idea how to set the options such that it actually minimizes.
I tried fitting a very simple exponential function using the code at the end of this message. I want the following:
I want the function to take as input the x- and y-values, just like MATLAB would do. Furthermore, I want some control over the options, to make sure that it actually minimizes (i.e. to a minimum!).
Of course, in the end I want to fit functions that are more complicated than exponential, but I want to be able to fit exponentials at least.
I have several problems with fminsearch:
I tried handing over the x- and y-values to the function, but a MATLAB-style call like this:
[xx,fval] = fminsearch(@exponential, [1000 1], x, y);
or
[xx,fval] = fminsearch(@exponential, [33000 1], options, x, y)
produces errors:
error: options(6) does not correspond to known algorithm
error: called from:
error: /opt/local/share/octave/packages/optim-1.0.6/fmins.m at line 72, column 16
error: /opt/local/share/octave/packages/optim-1.0.6/fminsearch.m at line 29, column 4
Or, respectively (for the second case above):
error: `x' undefined near line 4 column 3
error: called from:
error: /Users/paul/exponential.m at line 4, column 2
error: /opt/local/share/octave/packages/optim-1.0.6/nmsmax.m at line 63, column 6
error: /opt/local/share/octave/packages/optim-1.0.6/fmins.m at line 77, column 9
error: /opt/local/share/octave/packages/optim-1.0.6/fminsearch.m at line 29, column 4
Apparently, the order of arguments that fminsearch takes is different from the one in MATLAB. So what is the correct order?
How can I make fminsearch take values and options?
I found a workaround to the problem that the function would not take values: I defined the x- and y values as global. Not elegant, but at least then the values are available in the function.
Nonetheless, fminsearch does not minimize properly.
This is shown below:
Here is the function:
function f = exponential(coeff)
    global x
    global y
    X = x;
    Y = y;
    a = coeff(1);
    b = coeff(2);
    Y_fun = a .* exp(-X.*b);
    DIFF = Y_fun - Y;
    SQ_DIFF = DIFF.^2;
    f = sum(SQ_DIFF);
end
Here is the code:
global x
global y
x=[0:1:200];
y=4930*exp(-0.0454*x);
options(10)=10000000;
[cc,fval] = fminsearch(@exponential, [5000 0.01])
This is the output:
cc =
4930.0 5184.6
fval = 2.5571e+08
Why does fminsearch not find the solution?
There is an fminsearch implementation in the octave-forge package "optim".
You can see in its implementation file that the third parameter is always an options vector, the fourth is always a grad vector, so your ,x,y invocations will not work.
You can also see in the implementation that it calls an fmins implementation.
The documentation of that fmins implementation states:
if options(6)==0 && options(5)==0 - regular simplex
if options(6)==0 && options(5)==1 - right-angled simplex
Comment: the default is set to "right-angled simplex".
this works better for me on a broad range of problems,
although the default in nmsmax is "regular simplex"
A recent problem of mine would solve fine with MATLAB's fminsearch, but not with this octave-forge implementation. I had to specify an options vector [0 1e-3 0 0 0 0] to have it use a regular simplex instead of a 'right-angled simplex'. The Octave default makes no sense if your coefficients differ vastly in scale.
The optimization function fminsearch will always try to find a minimum, no matter what the options are. So if you are finding it's not finding a minimum, it's because it failed to do so.
From the code you provide, I cannot determine what goes wrong. The solution with the globals should work, and indeed does work over here, so something else on your side must be going awry. (NOTE: I do use MATLAB, not Octave, so those two functions could be slightly different...)
Anyway, why not do it like this?
function f = exponential(coeff)
    x = 0:1:200;
    y = 4930*exp(-0.0454*x);
    a = coeff(1);
    b = coeff(2);
    Y_fun = a .* exp(-x.*b);
    f = sum((Y_fun-y).^2);
end
Or, if you must pass x and y as external parameters,
x = [0:1:200];
y = 4930*exp(-0.0454*x);
[cc,fval] = fminsearch(@(c) exponential(c,x,y), [5000 0.01])

function f = exponential(coeff,x,y)
    a = coeff(1);
    b = coeff(2);
    Y_fun = a .* exp(-x.*b);
    f = sum((Y_fun-y).^2);
end