scipy minimization: how to code a Jacobian/Hessian for an objective function using a max value

I'm using scipy.optimize.minimize with the Newton-CG (Newton Conjugate Gradient) method since I have an objective function for which I know the analytical Jacobian and Hessian. However, I need to add a regularization term R=exp(max(s)) based on the maximum value inside the array parameter "s" that is being fit. It isn't entirely obvious to me how to implement derivatives for R. Letting the minimization algorithm do numeric derivatives for the whole objective function isn't an option, by the way, because it is far too complex. Any thoughts, oh wise people of the web?
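One possible approach (a sketch, not from the original thread): where the maximum of s is attained at a unique index i*, the exact gradient of R = exp(max(s)) is exp(s_{i*}) in component i* and zero everywhere else, and the Hessian has a single nonzero entry exp(s_{i*}) at (i*, i*). That works, but it switches discontinuously when the argmax changes, which Newton-CG may handle poorly. A common alternative is to replace max(s) with a log-sum-exp surrogate, which is differentiable everywhere. The sketch below implements that surrogate; the function name smooth_max_reg and the sharpness parameter beta are illustrative choices, not part of the question.

```python
import numpy as np
from scipy.special import logsumexp, softmax

def smooth_max_reg(s, beta=20.0):
    """Smooth surrogate for R = exp(max(s)) based on a log-sum-exp max.

    Uses m(s) = (1/beta) * logsumexp(beta * s), which approaches max(s)
    as beta grows, and returns (value, gradient, Hessian) so they can be
    added to the objective's own value/Jacobian/Hessian before calling
    scipy.optimize.minimize(method="Newton-CG").
    """
    s = np.asarray(s, dtype=float)
    m = logsumexp(beta * s) / beta              # smooth max
    w = softmax(beta * s)                       # dm/ds, a probability vector
    R = np.exp(m)
    grad = R * w                                # dR/ds_j = R * w_j
    # d2R/ds_j ds_k = R * (w w^T + beta * (diag(w) - w w^T))_{jk}
    hess = R * (np.outer(w, w) + beta * (np.diag(w) - np.outer(w, w)))
    return R, grad, hess
```

A larger beta tracks the true max more closely but makes the Hessian stiffer, so it may be worth starting with a modest beta and tightening it.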

Related

Nonlinear curve fitting of a matrix function in python

I have the following problem. I have an N x N real matrix Z(x; t), where x and t may be vectors in general. I have N_s observations (x_k, Z_k), k = 1, ..., N_s, and I'd like to find the vector of parameters t that best approximates the data in the least-squares sense, which means I want the t that minimizes
S(t) = \sum_{k=1}^{N_s} \sum_{i=1}^{N} \sum_{j=1}^{N} \left( Z_{k,ij} - Z(x_k; t)_{ij} \right)^2
This is, in general, a non-linear fit of a matrix function. I'm only finding examples in which one has to fit scalar functions, which are not immediately generalizable to a matrix function (or a vector function). I tried the scipy.optimize.leastsq function, the symfit package, and lmfit, but I still can't find a solution. Eventually I'm ending up writing my own code... any help is appreciated!
You can do curve fitting with multi-dimensional data. As far as I am aware, none of the low-level algorithms explicitly support multidimensional data, but they do minimize a one-dimensional array in the least-squares sense. And the fitting methods do not really care about the "independent variable(s)" x except insofar as they help you calculate the array to be minimized - for example, to calculate a model function to match to y data.
That is to say: if you can write a function that takes the parameter values and calculates the matrix to be minimized, just flatten that 2-d (or n-d) array to one dimension, as in the sketch below. The fit will not mind.
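A minimal sketch of that idea with scipy.optimize.least_squares, assuming a user-supplied callable Z_model(x, t) that returns an N x N matrix; the names residuals, Z_model, x_obs and Z_obs are placeholders, not from the question.

```python
import numpy as np
from scipy.optimize import least_squares

def residuals(t, x_obs, Z_obs, Z_model):
    """Stack the per-observation matrix residuals into one 1-D array.

    Z_model(x, t) returns the model's N x N matrix at observation point x;
    Z_obs[k] is the measured matrix at x_obs[k].  least_squares only sees
    a flat residual vector, so the matrix structure does not matter to it.
    """
    diffs = [Z_obs[k] - Z_model(x_obs[k], t) for k in range(len(x_obs))]
    return np.concatenate([d.ravel() for d in diffs])

# illustrative usage with a starting guess t0:
# fit = least_squares(residuals, t0, args=(x_obs, Z_obs, Z_model))
# fit.x then holds the estimated parameter vector t
```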

The right package/software for non-linear optimization with semidefinite constraints

I am struggling to solve an optimization problem, numerically, of the following (generic) form.
minimize F(x)
such that:
(1) 0 < x < 1
(2) M(x) >= 0.
where M(x) is a matrix whose elements are quadratic functions of x. The last constraint means that M(x) must be a positive semidefinite matrix. Furthermore F(x) is a callable function. For the more curious, here is a similar minimum-working-example.
I have tried a few options, but with no success.
PICOS, CVXPY and CVX -- In the first two cases, I cannot find a way of encoding a minimax problem such as mine. The third one is implemented in MATLAB, and there the matrices involved in a semidefinite constraint must be affine, so my problem falls outside this criterion.
fmincon -- How can we encode a matrix positivity constraint? One way is to compute the eigenvalues of the matrix M(x) analytically and constrain each one to be positive, but the analytic expressions for the eigenvalues can be horrendous.
MOSEK -- The objective function must be expressible in a standard form. I cannot find an example of a user-defined objective function.
scipy.optimize -- Along with the objective function and the constraints, it is necessary to provide their derivatives as well. In my case that is fine for the objective function, but expressing the matrix positivity constraint (as well as its derivative) through an analytic expression for the eigenvalues would be very tedious.
My apologies for not providing an MWE to illustrate my attempts with each of the packages above.
Can anyone please suggest a package/software which could be useful to me in solving my optimization problem?
Have a look at a nonlinear optimization package with box constraints, where different types of constraints may be coded via penalty or barrier techniques (a penalty-based sketch follows the link below).
Look at the following URL
merlin.cs.uoi.gr
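For completeness, a minimal sketch of the penalty idea with scipy.optimize.minimize, under the assumption that F(x) and a matrix builder M_of_x(x) are available as callables (both names, and the weights mu, are placeholders). The semidefinite constraint is replaced by a quadratic penalty on the negative eigenvalues of M(x), and the penalty weight is tightened between solves; derivatives are left to finite differences, so this is a starting point rather than a drop-in solution.

```python
import numpy as np
from scipy.optimize import minimize

def psd_penalty(M):
    """Quadratic penalty on the negative part of M's spectrum (zero when PSD)."""
    eigvals = np.linalg.eigvalsh(M)           # M is assumed symmetric
    return np.sum(np.minimum(eigvals, 0.0) ** 2)

def penalized_objective(x, F, M_of_x, mu):
    """Original objective plus mu times the PSD-violation penalty."""
    return F(x) + mu * psd_penalty(M_of_x(x))

# illustrative outer loop: tighten mu between solves, warm-starting each time
# n = ...  # problem dimension
# x0 = np.full(n, 0.5)
# for mu in (1e0, 1e2, 1e4):
#     res = minimize(penalized_objective, x0, args=(F, M_of_x, mu),
#                    method="L-BFGS-B", bounds=[(0.0, 1.0)] * n)
#     x0 = res.x
```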

2-Dimensional Minimization without Derivatives and Ignoring certain Input Parameters on the go

I have a function V which depends on two variables v1 and v2 and a parameter array p containing 15 parameters.
I want to minimize my function V with respect to v1 and v2, but there is no closed expression for the function, so I can't build and use its derivatives.
The problem is the following: to calculate the value of my function I need the eigenvalues of two 4x4 matrices (which should be symmetric and real by construction, but sometimes the eigensolver does not return real eigenvalues). I calculate these eigenvalues with the Eigen package. The entries of the matrices are given by v1, v2 and p.
There are certain input sets for which some of these eigenvalues become negative. I want to ignore these input sets in my calculation, since they lead to a complex function value and my function is only allowed to take real values.
Is there a way to include this? My first attempt was a Nelder-Mead simplex algorithm using the GSL library, with a far too high output value for the function whenever one of the eigenvalues becomes negative, but this doesn't work.
Thanks for any suggestions.
For the Nelder-Mead simplex, you could reject new points as vertices for the simplex, unless they have the desired properties.
Your method of artificially increasing the function value for forbidden points is also known as a penalty or barrier function. You might want to re-design your penalty function (see the sketch at the end of this answer).
Another optimization method without derivatives is the Simulated Annealing method. Again, you could modify the method to avoid forbidden points.
What do you mean by "doesn't work"? Does it take too long? Are the resulting function values too high?
Depending on the cost of a function evaluation, another approach is to simply scan a 2D interval, evaluate all width x height function values, and drill down into the tile with the lowest function values.
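To make the graded-penalty suggestion concrete, here is a sketch in Python/scipy rather than GSL (the same structure carries over to C++); V, matrices_of and penalty_scale are placeholders for the asker's own objective, matrix builder and scaling. The point is that the penalty grows with the size of the eigenvalue violation instead of being one huge constant, so the simplex still sees a slope back towards the allowed region.

```python
import numpy as np
from scipy.optimize import minimize

def safe_objective(v, V, matrices_of, penalty_scale=1e3):
    """Wrap V so that forbidden points get a graded, not constant, penalty.

    V(v1, v2) is the real objective; matrices_of(v1, v2) returns the two
    symmetric 4x4 matrices whose eigenvalues must stay non-negative.
    When any eigenvalue is negative, V is never called and the returned
    value grows with the total amount of violation.
    """
    v1, v2 = v
    eigvals = np.concatenate([np.linalg.eigvalsh(M) for M in matrices_of(v1, v2)])
    violation = -np.minimum(eigvals, 0.0).sum()
    if violation > 0.0:
        return penalty_scale * (1.0 + violation)
    return V(v1, v2)

# illustrative usage:
# res = minimize(safe_objective, x0=[0.1, 0.2], args=(V, matrices_of),
#                method="Nelder-Mead")
```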

Minimizing error of a formula in MATLAB (Least squares?)

I'm not too familiar with MATLAB or computational mathematics, so I was wondering how I might solve an equation involving a sum of squares, where each term involves two vectors: one known and one unknown. This formula is supposed to represent the error, and I need to minimize that error. I think I'm supposed to use least squares, but I don't know much about it, and I'm wondering which function is best for that and what arguments would represent my equation. My teacher also mentioned something about taking derivatives, and he formed a matrix using derivatives, which confused me even more: am I required to take derivatives?
The problem that you must be trying to solve is
\min_\beta u'u = \min_\beta \sum_i u_i^2, with u = y - X\beta, where u is the error, y is the vector of dependent variables you are trying to explain, X is the matrix of independent variables, and \beta is the vector you want to estimate.
Since \sum_i u_i^2 is differentiable (and convex), you can find the minimum of this expression by calculating its derivative and setting it equal to zero.
If you do that, you find that \beta = inv(X'X) X'y. This may be calculated using the MATLAB function regress (http://www.mathworks.com/help/stats/regress.html) or by writing the formula directly in MATLAB. However, you should be careful about how you evaluate the inverse of (X'X); see Most efficient matrix inversion in MATLAB.
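The same calculation in NumPy, purely for illustration (the rest of this page is Python-centric); in MATLAB, regress or the backslash operator X\y does the equivalent job. The data below are made up, and the point is to prefer a solver over forming inv(X'X) explicitly.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                    # known independent variables
beta_true = np.array([1.5, -2.0, 0.5])
y = X @ beta_true + 0.1 * rng.normal(size=100)   # known dependent variable

# textbook normal equations, solved without forming the inverse explicitly
beta_normal = np.linalg.solve(X.T @ X, X.T @ y)

# numerically preferable: a dedicated least-squares solver
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
```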

Fminunc returns indefinite Hessian matrix for a convex objective

When minimizing a convex objective function, does that mean the Hessian matrix at the minimizer should be PSD? If fminunc in MATLAB returns a Hessian which is not PSD, what does that mean? Am I using a wrong objective?
I do this kind of estimation in environments other than MATLAB, but some general points apply.
A non-PSD Hessian means you can't take its Cholesky factorization (a kind of matrix square root), so you can't use it to get standard errors, for example.
To get a good Hessian, your objective function has to be really smooth, because you're taking a second derivative, which doubly amplifies any noise. If possible, it is best to use analytic derivatives rather than finite differences. That is, if you really need the Hessian.
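A small diagnostic sketch (not from the original answer): symmetrize the returned Hessian, attempt a Cholesky factorization, and fall back to the eigenvalues to see how far from PSD it actually is. The name check_hessian and the tolerance tol are illustrative.

```python
import numpy as np

def check_hessian(H, tol=1e-10):
    """Return (is_psd, eigenvalues) for a numerically estimated Hessian.

    Symmetrizes H first, since finite-difference Hessians are rarely exactly
    symmetric, then tries a Cholesky factorization; if that fails, the
    eigenvalues show how negative the offending directions are.
    """
    H = 0.5 * (np.asarray(H) + np.asarray(H).T)
    try:
        np.linalg.cholesky(H + tol * np.eye(H.shape[0]))
        return True, None
    except np.linalg.LinAlgError:
        return False, np.linalg.eigvalsh(H)

# illustrative usage with a Hessian H exported from your solver:
# ok, eigs = check_hessian(H)
```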