fminbnd doesn't give the minimum value - matlab

I'm trying some built-in functions in MATLAB. I declared a function like this:
function y = myFunction(x)
y = cos(4*x) .* sin(10*x) .* exp(-abs(x));
end
Then I use fminbnd to find the minimum value:
fminbnd(@myFunction,-pi,pi)
This gives me the result:
ans =
0.7768
However, when I plot myFunction on [-pi,pi] with the following code, I get this figure:
>> x = -pi:0.01:pi;
>> y = myFunction(x);
>> plot(x,y)
The plot shows that the minimum value is about -0.77, which is not the result given by fminbnd. What's wrong here? I'm new to MATLAB and I don't know where I went wrong.

First things first: fminbnd returns the x-coordinate of the minimum it finds, not the minimum value itself. The actual minimum value is myFunction(0.7768); x = 0.7768 is just the location of that minimum.
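As a quick check, you can ask fminbnd for the function value as a second output (standard fminbnd usage, shown here with your function):
>> [xmin, fmin] = fminbnd(@myFunction, -pi, pi)
Here xmin is the location of the minimum (about 0.7768) and fmin equals myFunction(xmin).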
Now, I tried running your code with more verbose information. Specifically, I wanted to see how the minimum changes at each iteration. I overrode the default settings of fminbnd so we can see what's happening at each iteration.
This is what I get:
>> y = @(x) cos(4*x).*sin(10*x).*exp(-abs(x)); %// No need for function declaration
>> options = optimset('Display', 'iter');
>> [X,FVAL,EXITFLAG] = fminbnd(y, -pi, pi, options)
Func-count x f(x) Procedure
1 -0.741629 0.42484 initial
2 0.741629 -0.42484 golden
3 1.65833 -0.137356 golden
4 0.775457 -0.457857 parabolic
5 1.09264 0.112139 parabolic
6 0.896609 -0.163049 golden
7 0.780727 -0.457493 parabolic
8 0.7768 -0.457905 parabolic
9 0.776766 -0.457905 parabolic
10 0.776833 -0.457905 parabolic
Optimization terminated:
the current x satisfies the termination criteria using OPTIONS.TolX of 1.000000e-04
X =
0.776799595407872
FVAL =
-0.457905463395071
EXITFLAG =
1
X is the location of the minimum, FVAL is the function value at that location, and EXITFLAG = 1 means that the algorithm converged properly.
This obviously is not equal to your desired minimum. If I can reference the documentation of fminbnd, it specifically says this:
fminbnd may only give local solutions.
Going with that, the reason why you aren't getting the right answer is that your function has a lot of local minima. Specifically, if you zoom in to x=0.7784, this itself is a local minimum:
Since the algorithm managed to find a good local minimum here, it decides to stop.
You can get the true minimum if you restrict the search boundaries to be around where the true minimum actually is. Instead of [-pi,pi], try something like [-1,1] instead:
>> [X,FVAL,EXITFLAG] = fminbnd(y, -1, 1, options)
Func-count x f(x) Procedure
1 -0.236068 -0.325949 initial
2 0.236068 0.325949 golden
3 -0.527864 -0.256217 golden
4 -0.32561 0.0218758 parabolic
5 -0.0557281 -0.487837 golden
6 0.0557281 0.487837 golden
7 -0.124612 -0.734908 golden
8 -0.134743 -0.731415 parabolic
9 -0.126213 -0.735006 parabolic
10 -0.126055 -0.735007 parabolic
11 -0.126022 -0.735007 parabolic
12 -0.126089 -0.735007 parabolic
Optimization terminated:
the current x satisfies the termination criteria using OPTIONS.TolX of 1.000000e-04
X =
-0.126055418940111
FVAL =
-0.735007134768142
EXITFLAG =
1
When I did this, I got the right minimum location and the right minimum value.

While this is only a partial answer I will just point out the following text that is in the Limitations section of the documentation of fminbnd:
fminbnd may only give local solutions.
This is what is happening in your case. Often, when a function has multiple minima*, optimization algorithms can't find the global minimum.
Generally, the best approach when there are lots of minima is to split the search interval in two, compute the minimum over each part, and then compare to see which one is smaller (a rough sketch follows below).
*You can check whether your function has multiple minima by computing its derivative, counting the derivative's zero crossings, and dividing by two.
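For this particular function, that idea might look roughly like the following (note that fminbnd can still land on a local minimum inside each half, so finer subdivisions may be needed):
f = @(x) cos(4*x).*sin(10*x).*exp(-abs(x));
[x1, f1] = fminbnd(f, -pi, 0);   % minimum found on the left half
[x2, f2] = fminbnd(f, 0, pi);    % minimum found on the right half
if f1 < f2, xbest = x1; fbest = f1; else xbest = x2; fbest = f2; end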

Related

Calculate the autocorrelation of a time series created from a normal distribution

I generate a time series from a normal distribution and then I try to plot the autocorrelation by using the following code snippet:
ts1 = normrnd(0,0.25,1,100);
autocorrelation_ts1 = xcorr(ts1);
I was expecting the autocorrelation to be 1 at x = 0 and almost 0 for the rest of the values; instead I get a value of about 6 at axis position 100.
I think the question applies both to Matlab and Octave but I am not sure.
First thing is that your second line of code is wrong. I think you meant to put
autocorrelation_ts1 = xcorr(ts1);
Other than this, I think your solution is correct. The reason the max value is at 100 and not at 0 is that a temporal shift of 0 corresponds to the 100th element of the output (for a length-100 input, xcorr returns lags from -99 to 99). In other words, the numbers on the X axis don't correspond to time.
To get time on the X axis change your code to
[autocorrelation_ts1, shifts] = xcorr(ts1);
Then
plot(shifts, autocorrelation_ts1)
With regard to the max value, the MATLAB documentation for xcorr indicates that 1 is not the maximum output value of the function when it is called without a normalization argument. If you want to normalize so that the zero-lag value is 1 and all other values are 1 or less, use the 'normalized' scale option (called 'coeff' in older releases):
[autocorrelation_ts1, shifts] = xcorr(ts1, 'normalized');
Just as a complementary reference to Scott's answer, this is the complete code snippet, including stem-chart scaling to show up to 20 shifts/lags.
[auto_ts1, lags] = xcorr(ts1);
ts_begin = ceil(size(lags,2)/2);
ts_end = ts_begin + 20;
stem(lags(ts_begin:ts_end),auto_ts1(ts_begin:ts_end)/max(auto_ts1), 'linewidth', 4.0, 'filled')

Unreasonable [positive] log-likelihood values from matlab "fitgmdist" function

I want to fit a data set with a Gaussian mixture model. The data set contains about 120k samples and each sample has about 130 dimensions. When I use MATLAB to do it, I run the following script (with cluster number 1000):
gm = fitgmdist(data, 1000, 'Options', statset('Display', 'iter'), 'RegularizationValue', 0.01);
I get the following outputs:
iter log-likelihood
1 -6.66298e+07
2 -1.87763e+07
3 -5.00384e+06
4 -1.11863e+06
5 299767
6 985834
7 1.39525e+06
8 1.70956e+06
9 1.94637e+06
The log-likelihood is bigger than 0! I think this is unreasonable, and I don't know why.
Could somebody help me?
First of all, it is not a problem of how large your dataset is.
Here is some code that produces similar results with a quite small dataset:
options = statset('Display', 'iter');
x = ones(5,2) + (rand(5,2)-0.5)/1000;
fitgmdist(x,1,'Options',options);
this produces
iter log-likelihood
1 64.4731
2 73.4987
3 73.4987
Of course you know that the log function (the natural logarithm) has a range from -inf to +inf. I guess your problem is that you think the input to the log (i.e. the likelihood) should be bounded by [0,1]. Well, the likelihood is built from pdf values, which means it can become very large for a very dense dataset.
PDFs must be positive (which is why we can use the log on them) and must integrate to 1. But they are not bounded by [0,1].
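For instance (a tiny illustration, not part of your fit), a narrow Gaussian already has pdf values well above 1:
normpdf(0, 0, 0.01)   % = 1/(sqrt(2*pi)*0.01), roughly 39.9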
You can verify this by reducing the density in the above code
x = ones(5,2) + (rand(5,2)-0.5)/1;
fitgmdist(x,1,'Options',options);
this produces
iter log-likelihood
1 -8.99083
2 -3.06465
3 -3.06465
So, I would rather assume that your dataset contains several duplicate (or very close) values.
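If you want to test that assumption (here I assume your 120k-by-130 matrix is called data, with one sample per row), a quick check for exact duplicates is:
nDuplicates = size(data,1) - size(unique(data, 'rows'), 1)
Near-duplicates would need a distance-based check instead, e.g. with pdist on a subsample.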

Trying to produce exponential traffic

I'm trying to simulate an optical network algorithm in MATLAB for a homework project. Most of it is already done, but I have an issue with the diagrams I'm getting.
In the simulation I'm generating exponential traffic; however, for low lambda values (0.1) I'm getting very high packet drop rates (99%). I wrote a sample here which is very close to the testbench I'm running on my simulator.
% Run the simulation 10 times, with different lambda values
l = [1 2 3 4 5 6 7 8 9 10];
for i=l(1):l(end)
X = rand();
% In the 'real' simulation the following line defines the time
% when the next packet generation event will occur. Suppose that
% i is the current time
t_poiss = i + ceil((-log(X)/(i/10)));
distr(i)=t_poiss;
end
figure, plot(distr)
axis square
grid on;
title('Exponential test:')
The resulting diagram (not reproduced here) is IDENTICAL to the diagram I'm getting for the drop rate/λ. So I would like to ask if I'm doing something wrong or if I'm missing something. Is this the right thing to expect?
The problem might be a numerical one. Since you are generating a random number for X, that number can be incredibly small, i.e. very close to zero. If X is numerically close to zero, -log(X) is going to be HUGE, so your calculation of t_poiss will be huge. I would suggest doing something like X = rand() + 1 to make sure that X is never close to zero.
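To give a feel for the magnitudes involved (a toy illustration, not your simulator), a single small draw of X combined with a low rate already produces a very large gap:
lambda = 0.1;             % the low-lambda case from the question
X = 1e-6;                 % an unusually small, but possible, value of rand()
t_gap = -log(X)/lambda    % about 138 time units for a single packet gap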

Matlab division, cannot get back the answer

H = [1 2; 3 4; 5 6; 7 8; 9 10; 11 12; 13 14; 15 16];
X = [7; 9];
Y = H*X;
H1 = Y/X;
This is my code. As you can see, I was trying to get back the H values. However, it gave me something else. I have tried to use inv() but this is not possible because X is not a square matrix.
You can't get a value of rank 2 back by dividing a value of rank 1. The system is underconstrained.
Both mrdivide and pinv (the pseudo-inverse) can be used to get a solution to the system. Because there are multiple solutions, it won't necessarily be the one you started with. Instead you'll get a "simplest" solution, either in the sense of lowest cardinality or lowest 2-norm, depending on whether you use mrdivide or pinv.
Here, the pinv documentation page probably explains it more precisely than I can. Just note that it discusses the left-division form A\b instead of Y/X:
If A has more rows than columns and is not of full rank, then the overdetermined least squares problem
minimize norm(A*x-b)
does not have a unique solution. Two of the infinitely many solutions are
x = pinv(A)*b
and
y = A\b
These two are distinguished by the facts that norm(x) is smaller than the norm of any other solution and that y has the fewest possible nonzero components.
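To see this with your own matrices (the variable names below are the ones from your snippet), both "solutions" reproduce Y exactly, yet neither needs to equal the original H:
H = [1 2; 3 4; 5 6; 7 8; 9 10; 11 12; 13 14; 15 16];
X = [7; 9];
Y = H*X;
H1 = Y/X;          % mrdivide: a solution with as few nonzero entries as possible
H2 = Y*pinv(X);    % pseudo-inverse: the minimum-norm solution
norm(H1*X - Y)     % ~0, so H1 reproduces Y
norm(H2*X - Y)     % ~0, and so does H2
isequal(H1, H)     % false: the original H is not recovered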

Optimization with discrete parameters in Matlab

I have 12 sets of vectors (about 10-20 vectors each) and I want to pick one vector from each set so that a function f, which takes the sum of these vectors as its argument, is maximized. In addition, I have constraints on some components of that sum.
Example:
a_1 = [3 2 0 5], a_2 = [3 0 0 2], a_3 = [6 0 1 1], ... , a_20 = [2 12 4 3]
b_1 = [4 0 4 -2], b_2 = [0 0 1 0], b_3 = [2 0 0 4], ... , b_16 = [0 9 2 3]
...
l_1 = [4 0 2 0], l_2 = [0 1 -2 0], l_3 = [4 4 0 1], ... , l_19 = [3 0 9 0]
s = [s_1 s_2 s_3 s_4] = a_x + b_y + ... + l_z
Constraints:
s_1 > 40
s_2 < 100
s_4 > -20
Target: Choose x, y, ... , z to maximize f(s):
f(s) -> max
Where f is a nonlinear function that takes the vector s and returns a scalar.
Brute-forcing takes too long because there are about 5.9 trillion combinations, and since I need the maximum (or even better, the top 10 combinations) I cannot use any of the greedy algorithms that came to my mind.
The vectors are quite sparse; about 70-90% of their entries are zeros, in case that helps somehow.
The MATLAB Optimization Toolbox didn't help either, since it doesn't have much support for discrete optimization.
Basically this is a lock-picking problem, where the lock's pins have 20 distinct positions, and there are 12 pins. Also:
some of the pins' positions will be blocked, depending on the positions of all the other pins;
depending on the specifics of the lock, there may be multiple keys that fit.
...interesting!
Based on Rasman's approach and Phpdna's comment, and the assumption that you are using int8 as data type, under the given constraints there are
>> d = double(intmax('int8'));
>> (d-40) * (d+100) * (d+20) * 2*d
ans =
737388162
possible vectors s (give or take a few, haven't thought about +1's etc.). ~740 million evaluations of your relatively simple f(s) shouldn't take more than 2 seconds, and having found all s that maximize f(s), you are left with the problem of finding linear combinations in your vector set that add up to one of those solutions s.
Of course, this finding of combinations is no easy feat, and the whole method breaks down anyway if you are dealing with
int16: ans = 2.311325368800510e+018
int32: ans = 4.253529737045237e+037
int64: ans = 1.447401115466452e+076
So, I'll discuss a more direct and more general approach here.
Since we're talking integers and a fairly large search space, I'd suggest using a branch-and-bound algorithm. But unlike the bintprog algorithm, you'd have to use different branching strategies, and of course, these should be based on a non-linear objective function.
Unfortunately, there is nothing like this in the optimization toolbox (or the File Exchange as far as I could find). fmincon is a no-go, since it uses gradient and Hessian information (which will usually be all-zero for integers), and fminsearch is a no-go, since you'll need a really good initial estimate, and the rate of convergence is (roughly) O(N), meaning, for this 20-dimensional problem you'll have to wait quite long before convergence, without the guarantee of having found the global solution.
An interval method could be a possibility, however, I personally have very little experience with this. There is no native interval-related stuff in MATLAB or any of its toolboxes, but there's the freely available INTLAB.
So, if you're not feeling like implementing your own non-linear binary integer programming algorithm, or are not in the mood for an adventure with INTLAB, there's really only one thing left: heuristic methods. In this link there is a similar situation, with an outline of the solution: use the genetic algorithm (ga) from the Global Optimization toolbox.
I would implement the problem roughly like so:
function [sol, fval, exitflag] = bintprog_nonlinear()
%// insert your data here
%// Any sparsity you may have here will only make this more
%// *memory* efficient, not *computationally*
data = [...
... %// this will be an array with size 4-by-20-by-12
... %// (or some permutation of that you find more intuitive)
];
%// offsets into the 3D array to facilitate indexing a bit
offsets = bsxfun(@plus, ...
repmat(1:size(data,1), size(data,3),1), ...
(0:size(data,3)-1)' * size(data,1)*size(data,2)); %//'
%// your objective function
function val = obj(X)
%// limit "X" to integers in [1 20]
X = min(max(round(X),1),size(data,2));
%// "X" will be a collection of 12 integers between 1 and 20, which are
%// indices into the data matrix
%// form "s" from "X"
s = sum(data(bsxfun(@plus, offsets, X(:)*size(data,1) - size(data,1))));
%// XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxX
%// Compute the NEGATIVE VALUE of your function here
%// XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxX
end
%// your "non-linear" constraint function
function [C, Ceq] = nonlcon(X)
%// limit "X" to integers in [1 20]
X = min(max(round(X),1),size(data,2));
%// form "s" from "X"
s = sum(data(bsxfun(@plus, offsets, X(:)*size(data,1) - size(data,1))));
%// we have no equality constraints
Ceq = [];
%// Compute inequality constraints
%// NOTE: solver is trying to solve C <= 0, so:
C = [...
40 - s(1)
s(2) - 100
-20 - s(4)
];
end
%// useful GA options
options = gaoptimset(...
'UseParallel', 'always'...
...
);
%// The rest really depends on the specifics of the problem.
%// Useful to look at will be at least 'TolCon', 'Vectorized', and of course,
%// 'PopulationType', 'Generations', etc.
%// THE OPTIMIZATION
[sol, fval, exitflag] = ga(...
@obj, size(data,3), ... %// objective function, taking a vector of 12 values
[],[], [],[], ... %// no linear (in)equality constraints
1,size(data,2), ... %// lower and upper limits
@nonlcon, options); %// your "nonlinear" constraints
end
Note that even though your constraints are essentially linear, the way by which you must compute the value for your s necessitates the use of a custom constraint function (nonlcon).
Especially note that this is currently (probably) a sub-optimal way to use ga -- I don't know the specifics of your objective function, so a lot more may be possible. For instance, I currently use a simple round() to convert the input X to integers, but using 'PopulationType', 'custom' (with a custom 'CreationFcn', 'MutationFcn' etc.) might produce better results. Also, 'Vectorized' will likely speed things up a lot, but I don't know whether your function is easily vectorized.
And yes, I use nested functions (I just love those things!); it prevents these huge, usually identical lists of input arguments if you use sub-functions or stand-alone functions, and they can really be a performance boost because there is little copying of data. But, I realize that their scoping rules make them somewhat akin to goto constructs, and so they are -ahum- "not everyone's cup of tea"...you might want to convert them to sub-functions to prevent long and useless discussions with your co-workers :)
Anyway, this should be a good place to start. Let me know if this is useful at all.
Unless you can define some structure in how the vector sets are organized, there will be no intelligent way of solving your problem other than pure brute force.
Say you find an s such that f(s) is maximal given the constraints on s; you still need to figure out how to build that s from twelve 4-element vectors (an overdetermined system if there ever was one), where each vector has about 20 possible values. Sparsity may help, although I'm not sure how a vector with four elements can be 70-90% zero, and sparsity would only be useful if there were some yet-to-be-described structure in how the vectors are organized.
So I'm not saying you can't solve the problem, I'm saying you need to rethink how the problem is set up.
I know, this answer is reaching you really late.
Unfortunately, the problem as stated does not show many patterns that can be exploited beyond brute force (branch and bound, master/slave, etc.). A master/slave approach, i.e. first solving the continuous nonlinear problem as the master and then the discrete selection as the slave, could help, but with this many combinations, and without any more information about the vectors, there is not much room to work with.
However, given that the function is continuous almost everywhere and built from sums, products, and their inverses, the sparsity is a clear point to exploit here. If 70-90% of the vector entries are zero, a good part of the solution space will be close to zero or close to infinity. Hence an 80-20 pseudo-solution could easily discard the 'zero' combinations and use only the 'infinite' ones.
This way, the brute force could be guided.