Define initial parameters of a nonlinear fit with no information - matlab

I was wondering if there exists a technical way to choose initial parameters to these kind of problems (as they can take virtually any form). My question arises from the fact that my solution depends a little on initial parameters (as usual). My fit consists of 10 parameters and approximately 5120 data points (x,y,z) and has non linear constraints. I have been doing this by brute force, that is, trying parameters randomly and trying to observe a pattern but it led me nowhere.
I also have tried using MATLAB's Genetic Algorithm (to find a global optimum) but with no success as it seems my function has a ton of local minima.
For the purpose of my problem, I need justfy in some manner the reasons behind choosing initial parameters.

Without any insight on the model and likely values of the parameters, the search space is too large for anything feasible. Think that just trying ten values for every parameter corresponds to ten billion combinations.
There is no magical black box.

You can try Bayesian Optimization to find a global optimum for expensive black box functions. Matlab describes it's implementation [bayesopt][2] as
Select optimal machine learning hyperparameters using Bayesian optimization
but you can use it to optimize any function. Bayesian Optimization works by updating a prior belief over a distribution of functions with the observed data.
To speed up the optimization I would recommend adding your existing data via the InitialX and InitialObjective input arguments.

Related

How to ensure my optimization algorithm has found the solution?

I am performing a numerical optimization where I try to find the parameters of a statistical model that best match certain moments of the data. I have 6 parameters in total I need to find. I have written a matlab function which takes the parameters as input and gives the sum of squared deviations from the empirical moments as output. I use the fminsearch function to find the parameters and it gives me a solution.
However, I am unsure if this is really a global minimum. What type of checks I could do to ensure the numerical solution is correct? Plotting the function is challenging due to high dimensionality. Any general advice in solving this type of problem is also appreciated.
You are describing the difficulties of a global optimization problem.
As mentioned in one of the comments, fminsearch() and related function fminunc() will return a local minimum. It provides no guarantee that you will get a global minimum.
A simple way to check if the answer you get really is a global minimum, would be to run the function multiple times from various starting points. If the answer all converges to the same value, it might be a global minimum. If you find an answer with lower error values, then the last answer was not the global minimum.
The only way to be perfectly sure that you have the global minima, is to know whether or not your function is convex (i.e. your function has only a single minima.) This will have to be done analytically.
If it is not possible to be done analytically, there are many global optimization methods you may want to consider, including some available as this MATLAB toolbox.

Patternsearch discrete variables

I want to optimize a multi-variable function with the patternsearch function in MATLAB. The function requires a lower and upper boundary and looks within the boundaries in a continuous domain.
I however have a discrete set of values in an excel file and would like the algorithm to search within this discrete domain instead of in the continuous domain.
Is this possible with patternsearch?
Maybe I don't understand correctly your question but if you have a (discret and finite) set of values, why don't you compute the function's value at these points and return the minium?
In short, no. That is not what patternsearch is intended for. Optimization techniques for discrete and continuous search spaces are quite expectedly different.
If you're looking for an approximate answer however, it is possible to use spline, polyfit, etc. to arrive at an approximate continuous function for your data and then apply patternsearch on it.
If you provide greater detail about your problem, I or someone else may be able to suggest a more suitable way of working with your data.
The best optimization tool for this is the Genetic Algorithm. This optimization tool comes with Matlab's global optimization toolbox and allows for optimization of both continuous and discrete variables at the same time.
In the genetic algorithm variables that are integers have to be declared as such. Non-declared variables are continuous by default.
Check the Global Optimization Toolbox guide for information on how it works: http://it.mathworks.com/help/pdf_doc/gads/gads_tb.pdf.

Why does GlobalSearch return different solutions each run?

When running the GlobalSearch solver on a nonlinear constrained optimization problem I have, I often get very different solutions each run. For the cases that I have an analytical solution, the numerical results are less dispersed than the non-analytical cases but are still different each run. It would be nice to get the same results at least for these analytical cases so that I know the optimization routine is working properly. Is there a good explanation of this in the Global Optimization Toolbox User Guide that I missed?
Also, why does GlobalSearch use a different number of local solver runs each run?
Thanks!
A full description of how the GlobalSearch algorithm works can be found Here.
In summary the GlobalSearch method iteratively performs convex optimization. Basically it starts out by using fmincon to search for a local minimum near the initial conditions you have provided. Then a bunch of "trial points", based on how good the initial result was, are generated using the "scatter search algorithm." Then there is some more convex optimization and rating of "how good" the minima around these points are.
There are a couple of things that can cause the algorithm give you different answers:
1. Changing the initial conditions you give it
2. The scatter search algorithm itself
The fact that you are getting different answers each time likely means that your function is highly non-convex. The best thing that I know of that you can do in this scenario is just to try the optimization algorithm at several different initial conditions and see what result you get back the most frequently.
It also looks like there is something called the 'PlotFcns' property which would allow you get a better idea what the functions the solver is generating for you look like.
You can use the ga or gamulti objective functions within the GlobalSearch api. I would recommend this. Convex optimizers wont be able to solve a non-linear problem. Even then Genetic Algorithms dont gaurantee the solution. If you run the ga and then use its final minimum as the start of your fmincon search then it should result in the same answer consistently. There may be better ones but if the search space is unknown you may never know.

Alternatives to FMINCON

Are there any faster and more efficient solvers other than fmincon? I'm using fmincon for a specific problem and I run out of memory for modest sized vector variable. I don't have any supercomputers or cloud computing options at my disposal, either. I know that any alternate solution will still run out of memory but I'm just trying to see where the problem is.
P.S. I don't want a solution that would change the way I'm approaching the actual problem. I know convex optimization is the way to go and I have already done enough work to get up until here.
P.P.S I saw the other question regarding the open source alternatives. That's not what I'm looking for. I'm looking for more efficient ones, if someone faced the same problem adn shifted to a better solver.
Hmmm...
Without further information, I'd guess that fmincon runs out of memory because it needs the Hessian (which, given that your decision variable is 10^4, will be 10^4 x numel(f(x1,x2,x3,....)) large).
It also takes a lot of time to determine the values of the Hessian, because fmincon normally uses finite differences for that if you don't specify derivatives explicitly.
There's a couple of things you can do to speed things up here.
If you know beforehand that there will be a lot of zeros in your Hessian, you can pass sparsity patterns of the Hessian matrix via HessPattern. This saves a lot of memory and computation time.
If it is fairly easy to come up with explicit formulae for the Hessian of your objective function, create a function that computes the Hessian and pass it on to fmincon via the HessFcn option in optimset.
The same holds for the gradients. The GradConstr (for your non-linear constraint functions) and/or GradObj (for your objective function) apply here.
There's probably a few options I forgot here, that could also help you. Just go through all the options in the optimization toolbox' optimset and see if they could help you.
If all this doesn't help, you'll really have to switch optimizers. Given that fmincon is the pride and joy of MATLAB's optimization toolbox, there really isn't anything much better readily available, and you'll have to search elsewhere.
TOMLAB is a very good commercial solution for MATLAB. If you don't mind going to C or C++...There's SNOPT (which is what TOMLAB/SNOPT is based on). And there's a bunch of things you could try in the GSL (although I haven't seen anything quite as advanced as SNOPT in there...).
I don't know on what version of MATLAB you have, but I know for a fact that in R2009b (and possibly also later), fmincon has a few real weaknesses for certain types of problems. I know this very well, because I once lost a very prestigious competition (the GTOC) because of it. Our approach turned out to be exactly the same as that of the winners, except that they had access to SNOPT which made their few-million variable optimization problem converge in a couple of iterations, whereas fmincon could not be brought to converge at all, whatever we tried (and trust me, WE TRIED). To this day I still don't know exactly why this happens, but I verified it myself when I had access to SNOPT. Once, when I have an infinite amount of time, I'll find this out and report this to the MathWorks. But until then...I lost a bit of trust in fmincon :)

fit data(measurements) with numerical datasets

I have some data for which I have a set of numerically determined model curves. Now I would like to find the one with least square deviation, I only need to vary one parameter, which is the amplitude of these model curves.
I used fitting with analytic functions, but I did not find a way to handle such a problem.
Is there any solution?
Thanks a lot!
One of the optimize functions should do the trick. You can also read the section on optimization in the manual. Without any specifics on the data or the model you wish to match, it's hard to recommend anything more specific. For example, if your cost function has many maxima and minima or is not differentiable, you'll have to choose some of the more expensive routines.