NaN when using constrained minimization in Matlab?

I am trying to maximize a function of 10 variables. I do this by the following piece of code:
b = [1 2 3 4 1 7 -1 -4 9 1].'; %// initial value of the decision variable
goal = @(b) -sum(F(b));
I want to maximize F over the parameters b, the last of which must be ≥ 0, so I implemented the optimization as follows:
%// lower bounds for decision variable
lb = [-inf(1,9) 0];
%// upper bounds for decision variable
ub = inf(1,10);
myfminoptions = optimset('Display','Iter', 'Algorithm','active-set');
theta1 = fmincon(goal, theta0, [],[],[],[], lb,ub, [], myfminoptions);
That is, I find the minimum of the negative of the function, which should be the same as finding the maximum of that function.
My problem is that at every iteration the first-order optimality is reported as Inf, together with the message 'Hessian not updated'.
Additionally, the value of my function is NaN in the first iteration, which I simply don't understand. This might be the cause, but I am not sure.
Edit
This is the function that I am using:
function ll = mytobit(theta);
global x y;
b=theta(1:size(theta,1)-1);
s=theta(size(theta,1));
ll = (y==0).*log(1-normcdf(x*b/s))+ (y>0).*(-0.5*(log(2*pi)+log(s^2)+(y-x*b).^2/s^2));
return;
When I call the function with theta0, I get N log-likelihood values (N is the number of rows in X and Y), as expected.
An example of Y and X:
Y = [0;0;2047;1890;1975]
X = [2300, 34, 1156, 0, 1;
2100, 35, 1225, 0, 1;
2760, 36, 1296, 1, 0;
2300, 37, 1369, 1, 0
2455, 38, 1444, 0, 0]

You should probably be using the 'sqp' algorithm, as opposed to active-set. The sqp algorithm can recover from NaNs produced when s strays too close to zero, where the derivatives are undefined.
Also, you would hopefully not be initializing the s variable near zero in theta0.
Because s is actually supposed to be strictly greater than zero, it may help to re-express it as s=exp(-d/2), where d is a new unconstrained unknown variable. Also, your objective function depends on b only through the quantity B=b/s. Since log(s^2) = -d and 1/s = exp(d/2), you could formulate it this way:
ll = (y==0).*log(1-normcdf(x*B))+ (y>0).*(-0.5*(log(2*pi)-d+(y*exp(d/2)-x*B).^2));
Now there are no ill-defined regions in the function or its derivatives.
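As a sanity check on this change of variables, here is a minimal sketch that evaluates the log-likelihood in both parametrizations on the example data and compares them. Note that with s = exp(-d/2) the factor 1/s equals exp(d/2), so y/s becomes y*exp(d/2). The values in theta are hypothetical, x and y are captured by the anonymous functions instead of being globals, and 1-normcdf(z) is written as 0.5*erfc(z/sqrt(2)) only so that the snippet runs without a toolbox; none of this is from the original post.

```matlab
% Example data from the question
x = [2300 34 1156 0 1; 2100 35 1225 0 1; 2760 36 1296 1 0; ...
     2300 37 1369 1 0; 2455 38 1444 0 0];
y = [0; 0; 2047; 1890; 1975];

% Upper tail of the standard normal, i.e. 1 - normcdf(z)
upper_tail = @(z) 0.5 * erfc(z / sqrt(2));

% Original parametrization: theta = [b; s]
ll_old = @(theta) (y == 0) .* log(upper_tail(x * theta(1:end-1) / theta(end))) + ...
    (y > 0) .* (-0.5 * (log(2*pi) + log(theta(end)^2) + ...
    (y - x * theta(1:end-1)).^2 / theta(end)^2));

% Reparametrized version: phi = [B; d] with B = b/s and s = exp(-d/2)
ll_new = @(phi) (y == 0) .* log(upper_tail(x * phi(1:end-1))) + ...
    (y > 0) .* (-0.5 * (log(2*pi) - phi(end) + ...
    (y * exp(phi(end)/2) - x * phi(1:end-1)).^2));

% Hypothetical parameter values, chosen only so that everything stays finite
theta = [0.001; 0.01; 0.001; 0.1; 0.1; 500];
b = theta(1:end-1);
s = theta(end);
phi = [b / s; -2 * log(s)];

max_diff = max(abs(ll_new(phi) - ll_old(theta)));
```

If max_diff is at round-off level, the two forms agree, and fmincon can then be run on -sum(ll_new(phi)) with no bound needed on d.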

Related

determine lag between two vector

I want to find the minimum lag between two vectors, that is, the minimum shift at which a pattern in one vector is repeated in the other.
for example for
x=[0 0 1 2 2 2 0 0 0 0]
y=[1 2 2 2 0 0 1 2 2 2]
I want to obtain 4 for x to y and 2 for y to x.
I found the finddelay(x,y) function, but it works as I want only for x to y (it gives -4 for y to x).
Is there any function that gives me the lag only in the forward (rightward) direction of the vector? I would be very thankful for any help getting this result.
I think this may be a potential bug in finddelay. Note this excerpt from the documentation (emphasis mine):
X and Y need not be exact delayed copies of each other, as finddelay(X,Y) returns an estimate of the delay via cross-correlation. However this estimated delay has a useful meaning only if there is sufficient correlation between delayed versions of X and Y. Also, if several delays are possible, as in the case of periodic signals, the delay with the smallest absolute value is returned. In the case that both a positive and a negative delay with the same absolute value are possible, the positive delay is returned.
This would seem to imply that finddelay(y, x) should return 2, when it actually returns -4.
EDIT:
This would appear to be an issue related to floating-point errors introduced by xcorr as I describe in my answer to this related question. If you type type finddelay into the Command Window, you can see that finddelay uses xcorr internally. Even when the inputs to xcorr are integer values, the results (which you would expect to be integer values as well) can end up having floating-point errors that cause them to be slightly larger or smaller than an integer value. This can then change the indices where maxima would be located. The solution is to round the output from xcorr when you know your inputs are all integer values.
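To see the tie between candidate delays without any round-off, here is a small sketch that computes the same cross-correlation with conv instead of xcorr. For real row vectors of equal length, conv(x, fliplr(y)) gives the raw cross-correlation at lags -(N-1):(N-1), and for integer inputs the sums are exact. This is only an illustration of the idea, not what finddelay does internally.

```matlab
x = [0 0 1 2 2 2 0 0 0 0];
y = [1 2 2 2 0 0 1 2 2 2];
N = numel(x);

% Raw cross-correlation c(m) = sum_n x(n+m)*y(n), exact for integer inputs
c = conv(x, fliplr(y));
lags = -(N-1):(N-1);

% All lags that reach the maximum correlation, converted to delays
delays = -lags(c == max(c));
```

For this test case two delays tie at the maximum, which is why tiny floating-point differences in xcorr can decide which one finddelay reports.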
A better implementation of finddelay for integer values might be something like this, which would actually return the delay with the smallest absolute value:
function delay = finddelay_int(x, y)
[d, lags] = xcorr(x, y);
d = round(d);
lags = -lags(d == max(d));
[~, index] = min(abs(lags));
delay = lags(index);
end
However, in your question you are asking for the positive delays to be returned, which won't necessarily be the smallest in absolute value. Here's a different implementation of finddelay that works correctly for integer values and gives preference to positive delays:
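function delay = finddelay_pos(x, y)
    [d, lags] = xcorr(x, y);
    d = round(d);                  % inputs are integers, so remove round-off
    lags = -lags(d == max(d));     % candidate delays, in descending order
    pos = lags(lags > 0);          % positive candidates only
    if isempty(pos)
        delay = lags(1);           % no positive delay: take the one closest to zero
    else
        delay = pos(end);          % smallest positive delay
    end
end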
And here are the various results for your test case:
>> x = [0 0 1 2 2 2 0 0 0 0];
>> y = [1 2 2 2 0 0 1 2 2 2];
>> [finddelay(x, y) finddelay(y, x)] % The default behavior, which fails to find
% the delays with smallest absolute value
ans =
4 -4
>> [finddelay_int(x, y) finddelay_int(y, x)] % Correctly finds the delays with the
% smallest absolute value
ans =
-2 2
>> [finddelay_pos(x, y) finddelay_pos(y, x)] % Finds the smallest positive delays
ans =
4 2

summation of exponential distribution with different parameters

I just calculated the distribution of a sum of two exponential random variables with different lambdas.
It is known that a sum of exponential random variables with the same rate follows an Erlang (Gamma) distribution.
However, when the lambdas are different, the result is a little bit different.
Anyway, look at the following equations.
Now, the problem is the factor (alpha_1 λ_2-alpha_2 λ_1).
In my case (alpha_1 λ_2-alpha_2 λ_1) becomes 0,
so the last two terms go to infinity.
Is that true?
I wrote some simple MATLAB code for verification.
clc;
clear;
mu=[1 2];
a1 = mu(1)/(mu(1)+mu(2));
a2 = mu(2)/(mu(1)+mu(2));
n = 10^6;
x = exprnd(mu(1), [1, n]);
y = exprnd(mu(2), [1, n]);
z = a1*x + a2*y;
figure
histfit(z, 100, 'gamma')
The figure is the pdf of Z=alpha_1 * X + alpha_2 * Y.
This case is λ_1 = 1, λ_2 = 2. (The red line is the fitted gamma distribution.)
The MATLAB result shows that the density of the random variable Z is not infinite.
What is the problem in my calculations?
I found the problem in my integral calculation. In the 6th row, e^-(lambda2-alpha2*lambda1/alpha1) = 1; thus, there is no term alpha1/(alpha1*lambda2-alpha2*lambda1) in the 7th row.
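For reference, the standard result can be written out as follows. This is a sketch under the assumption that X and Y are independent with rates λ_1 and λ_2 (so that alpha_i X_i has rate β_i = λ_i / alpha_i); the symbols β_1, β_2, β are introduced here and do not appear in the original derivation.

```latex
% Z = \alpha_1 X + \alpha_2 Y, with \beta_i = \lambda_i / \alpha_i
f_Z(z) = \frac{\beta_1 \beta_2}{\beta_2 - \beta_1}
         \left( e^{-\beta_1 z} - e^{-\beta_2 z} \right),
\qquad \beta_1 \neq \beta_2, \; z \ge 0,
\qquad
\beta_2 - \beta_1 = \frac{\alpha_1 \lambda_2 - \alpha_2 \lambda_1}{\alpha_1 \alpha_2}.

% Equal-rate case: the limit \beta_2 \to \beta_1 = \beta is finite
f_Z(z) = \beta^2 \, z \, e^{-\beta z}, \qquad z \ge 0 .
```

When alpha_1 λ_2 - alpha_2 λ_1 = 0, the first formula is a 0/0 form rather than an infinite value, and its limit is the Erlang (Gamma with shape 2) density, which is consistent with the histogram.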

matlab: matrix symbolic solution

Thanks much for your time and for all your help.
Actually, I made a mistake in the previous post when specifying the problem. Thus, I reformulate my question using a simpler example. I need to solve symbolically the equation Ct = Z/(P-I) or Ct*(P-I) = Z.
I already know the answer => Ct = [sigma, 1-sigma]
How should I write the code in order to get this solution?
syms sigma;
Ct = sym('Ct',[1 2]);
%
P = [sigma 1-sigma;
sigma 1-sigma];
I = [1 0;
0 1];
Z = [0 0];
%
solve(Ct*(P-I) == Z);
So far, I get :
Z =
0 0
Warning: The solutions are parametrized by the symbols:
z = C_
In solve at 190
In test_matrix_sigma at 13
Or with
solve(Ct == Z/(P-I), Ct);
I get:
Warning: System is rank deficient. Solution is not unique.
Warning: 4 equations in 2 variables.
In /opt/MATLAB/R2013a/toolbox/symbolic/symbolic/symengine.p>symengine at 56
In mupadengine.mupadengine>mupadengine.evalin at 97
In mupadengine.mupadengine>mupadengine.feval at 150
In solve at 170
In test_matrix_sigma at 13
---------------------------------------------------------------------------------------
Thanks for the answer !
Now I have two issues:
1) When I try to handle a more complicated system:
syms a b P1 P2;
I = [1 0 0 0;
0 1 0 0;
0 0 1 0;
0 0 0 1];
%
P = [a*P1 (1-a)*P1 (1-b)*(1-P1) b*(1-P1);
a*P1 (1-a)*P1 (1-b)*(1-P1) b*(1-P1);
b*(1-P2) (1-b)*(1-P2) (1-b)*P2 b*(1-P2);
b*(1-P2) (1-b)*(1-P2) (1-b)*P2 b*(1-P2)];
%
assume(a, 'real');
assume(b, 'real');
assume(P1, 'real');
assume(P2, 'real');
%
answer = null((P-I)');
disp(answer);
I get
ans =
[ empty sym ]
as the only answer.
2) Is there a way in MATLAB to "solve" the above symbolic matrix P and find its symbolic determinant?
For instance, eig(P) works,
but det(P) gives 0 as the answer...
This post is an answer to a different problem, that was first asked by the OP before being edited. I leave the problem and solution here in case someone ever runs into the same problem:
I need to solve symbolically the following matrix equation to find out Ct (a vector ???):
syms a b P1 P2
%
P = [a*P1 (1-a)*P1 (1-b)*(1-P1) b*(1-P1);
a*P1 (1-a)*P1 (1-b)*(1-P1) b*(1-P1);
b*(1-P2) (1-b)*(1-P2) (1-b)*P2 b*(1-P2);
b*(1-P2) (1-b)*(1-P2) (1-b)*P2 b*(1-P2)];
%
solve(Ct*(P-1) == 0, Ct);
How to proceed ?
So far I get:
Undefined function or variable 'Ct'.
Error in matrix_test (line 10) solve(Ct*(P-1) == 0, Ct);
The error you get is because you did not assign Ct before trying to solve for your equation. In the equation Ct*(P-1) == 0, Matlab does not know what Ct is. You could remedy this by creating a symbolic vector (see sym documentation). For instance:
Ct = sym('Ct', [1 4]);
However, using solve on this would not give you the solutions you're looking for: instead, Matlab is going to give you the trivial answer Ct = 0, which of course is a correct answer to your equation.
What you really want to find is the null space of the (P-1)' matrix: the null space is the set of vectors X such that (P-1)'X = 0 (Which is the same thing as X'(P-1) = 0, so Ct = X'). The Matlab function null (see doc) is what you need. Using your code, I get:
null((P-1)')
ans =
[ -1, 0]
[ 1, 0]
[ 0, -1]
[ 0, 1]
This means that any linear combination of the vectors [-1, 1, 0, 0] and [0, 0, -1, 1] belongs to the null space of (P-1)', and therefore any such combination, written as a row vector, is the Ct you were looking for.
N.B.: This result is easily confirmed by observation of your initial matrix P.
This edited problem is only slightly different from the first one. Once again, you are looking to solve a homogeneous system of linear equations.
The warnings you get simply tell you that there is an infinity of solutions. The first warning tells you that the answer is parameterized by C_, meaning it lies in the complex plane; you could add assume(sigma, 'real') and assume(Ct, 'real') if you wanted, and you'd get an answer parameterized by R_.
The solution is to find the null space of the matrix (P-I)', as for the previous problem.
null((P-I)')
ans =
-sigma/(sigma - 1)
1
Now if your vector Z became different from 0, you would need to add the particular solution, that is Z/(P-I). In the present case, it gives:
Z/(P-I)
Warning: System is rank deficient. Solution is not unique.
ans =
[ 0, 0]
This means that in this case the particular solution is [0 0], and the result of null gives you the homogeneous solution. Remember that the complete solution of a linear system of equations is the sum of the particular solution + a linear combination of the elements of the null space. A way to express this in Matlab could be:
syms lambda real;
sol = Z/(P-I) + lambda * null((P-I)')'
sol =
[ -(lambda*sigma)/(sigma - 1), lambda]
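The symbolic result can be cross-checked numerically. The sketch below picks a hypothetical value sigma = 0.3, computes the numeric null space of (P-I)', and rescales it so that its entries sum to 1. That normalization is an assumption (it corresponds to choosing lambda = 1 - sigma above) made so the result can be compared with the expected Ct = [sigma, 1-sigma].

```matlab
sigma = 0.3;                          % hypothetical numeric value for the check
P = [sigma 1-sigma; sigma 1-sigma];
I = eye(2);

v = null((P - I)')';                  % 1x2 basis vector of the left null space
Ct = v / sum(v);                      % rescale so the entries sum to 1

residual = norm(Ct * (P - I));        % should be at round-off level
```

Dividing by sum(v) also removes the arbitrary sign that the numeric null may attach to the basis vector.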

Generate matrix with for-loop in matlab

Say I have two functions f(x), g(x), and a vector:
xval=1:0.01:2
For each of these individual x values, I want to define a vector of y-values, covering the y-interval bounded by the two functions (or possibly a matrix where columns are x-values, and rows are y-values).
How would I go about creating a loop that would handle this for me? I have absolutely no idea myself, but I'm sure some of you have something right up your sleeve. I've been sweating over this problem for a few hours by now.
Thanks in advance.
Since you wish to generate a matrix, I assume the number of values between f(x) and g(x) should be the same for every xval. Let's call that number of values n_pt. Then, we also know what the dimensions of your result matrix rng will be.
n_pt = 10;
xval = 1 : 0.01 : 2;
rng = zeros(n_pt, length(xval));
Now, into the loop. Once we know what the y-values returned by f(x) and g(x) are, we can use linspace to give us n_pt equally spaced points between them.
for n = 1 : length(xval)
    y_f = f(xval(n));   % lower/upper bound from f at this x
    y_g = g(xval(n));   % lower/upper bound from g at this x
    rng(:, n) = linspace(y_f, y_g, n_pt)';
end
This is nice because with linspace you don't need to worry about whether y_f > y_g, y_f == y_g or y_f < y_g. That's all taken care of already.
For demonstration, I ran this example for xval = 1 : 0.1 : 2 and the two sinusoids f = @(x) sin(2 * x) and g = @(x) sin(x) * 2. The points are plotted using plot(xval, rng, '*k');.
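If f and g accept vector input, the same matrix can also be built without a loop. This is a sketch of one alternative, not a replacement for the loop above: the weights t and the name y_mat are introduced here for illustration, and the loop version is rebuilt under the name y_loop only to compare the two.

```matlab
f = @(x) sin(2 * x);
g = @(x) sin(x) * 2;
n_pt = 10;
xval = 1 : 0.1 : 2;

% Column of interpolation weights from 0 (on f) to 1 (on g)
t = linspace(0, 1, n_pt)';

% Outer products give an n_pt-by-numel(xval) matrix in one step
y_mat = (1 - t) * f(xval) + t * g(xval);

% Loop version from above, for comparison
y_loop = zeros(n_pt, length(xval));
for n = 1 : length(xval)
    y_loop(:, n) = linspace(f(xval(n)), g(xval(n)), n_pt)';
end
max_diff = max(abs(y_mat(:) - y_loop(:)));
```

Each column of y_mat runs from f(x) in the first row to g(x) in the last row, just as with linspace.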

Update only one matrix element for iterative computation

I have a 3x3 matrix, A. I also compute a value, g, as the maximum eigenvalue of A. I am trying to vary the element A(3,3), which is initially 0, over all values from zero to one in 0.10 increments and then update g for each of those values. I'd like all of the other matrix elements to remain the same.
I thought a for loop would be the way to do this, but I do not know how to update only one element in a matrix without storing this update as one increasingly larger matrix. If I call the element at A(3,3) = p (thereby creating a new matrix Atry) I am able (below) to get all of the values from 0 to 1 that I desired. I do not know how to update Atry to get all of the values of g that I desire. The state of the code now will give me the same value of g for all iterations, as expected, as I do not know how to to update Atry with the different values of p to then compute the values for g.
Any suggestions on how to do this or suggestions for jargon or phrases for me to web search would be appreciated.
A = [1 1 1; 2 2 2; 3 3 0];
g = max(eig(A));
% This below is what I attempted to achieve my solution
clear all
p(1) = 0;
Atry = [1 1 1; 2 2 2; 3 3 p];
g(1) = max(eig(Atry));
for i=1:100;
p(i+1) = p(i)+ 0.01;
% this makes a one giant matrix, not many
%Atry(:,i+1) = Atry(:,i);
g(i+1) = max(eig(Atry));
end
This will also accomplish what you want to do:
A = @(x) [1 1 1; 2 2 2; 3 3 x];
p = 0:0.01:1;
g = arrayfun(@(x) eigs(A(x),1), p);
Breakdown:
Define A as an anonymous function. This means that the command A(x) will return your matrix A with the (3,3) element equal to x.
Define all steps you want to take in vector p
Then "loop" through all elements in p by using arrayfun instead of an actual loop.
The function looped over by arrayfun is not max(eig(A)) but eigs(A,1), which returns the single eigenvalue of largest magnitude. For these matrices that is also the maximum eigenvalue, so the result will be the same. Instead of computing all eigenvalues and then keeping only the maximum one, eigs is designed to compute just the ones requested, which pays off mainly for large sparse matrices; for a 3x3 matrix the difference is negligible.
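As a cross-check of the arrayfun approach, here is a sketch that uses max(eig(...)) and compares the result with a closed form. The closed form is my own derivation, not part of the original answer: because the second row is twice the first, one eigenvalue is always 0 and the other two solve λ^2 - (3+x)λ + (3x-9) = 0.

```matlab
A = @(x) [1 1 1; 2 2 2; 3 3 x];
p = 0:0.01:1;

% Largest eigenvalue for every value of the (3,3) element
g = arrayfun(@(x) max(eig(A(x))), p);

% Larger root of lambda^2 - (3+x)*lambda + (3x-9) = 0
g_closed = ((3 + p) + sqrt((3 + p).^2 - 4 * (3 * p - 9))) / 2;

max_diff = max(abs(g - g_closed));
```

For x between 0 and 1 the product of the two nonzero eigenvalues, 3x-9, is negative and their sum is positive, so the positive one is also the one of largest magnitude.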
First, you say 0.1 increments in the text of your question, but your code suggests you are actually interested in 0.01 increments? I'm going to operate under the assumption you mean 0.01 increments.
Now, with that out of the way, let me state what I believe you are after given my interpretation of your question. You want to iterate over the matrix A, where for each iteration you increase A(3, 3) by 0.01. Given that you want all values from 0 to 1, this implies 101 iterations. For each iteration, you want to calculate the maximum eigenvalue of A, and store all these eigenvalues in some vector (which I will call gVec). If this is correct, then I believe you just want the following:
% Specify the "Current" A
CurA = [1 1 1; 2 2 2; 3 3 0];
% Pre-allocate the values we want to iterate over for element (3, 3)
A33Vec = (0:0.01:1)';
% Pre-allocate a vector to store the maximum eigenvalues
gVec = NaN * ones(length(A33Vec), 1);
% Loop over A33Vec
for i = 1:1:length(A33Vec)
    % Obtain the version of A that we want for the current i
    CurA(3, 3) = A33Vec(i);
    % Obtain the maximum eigenvalue of the current A, and store in gVec
    gVec(i, 1) = max(eig(CurA));
end
EDIT: Probably best to paste this code into your matlab editor. The stack-overflow automatic text highlighting hasn't done it any favors :-)
EDIT: Go with Rody's solution (+1) - it is much better!