fmincon with lower bound fails, even though solution is at initial point - matlab

I'm trying to minimize a non-linear objective function (my actual function is much more complicated, but even this simple one illustrates the point), where I know that the minimum is attained at the initial point x0:
fun = @(x) x(1)^2+x(2)^2;
x0 = [0 0];
lb1 = [0 0];
lb2 = [-1 -1];
[xc1 fvalc1] = fmincon(fun, x0, [],[],[],[], lb1, [Inf Inf])
Which outputs:
>> xc1 = 1.0e-03 * [0.6457 0.6457]
>> fvalc1 = 8.3378e-07
However, either using a different lower bound or using fminsearch instead works correctly:
[xc2 fvalc2] = fmincon(fun, x0, [],[],[],[], lb2, [Inf Inf])
>> xc2 = [0 0]
>> fvalc2 = 0
[xs fvals] = fminsearch(fun, x0)
>> xs = [0 0]
>> fvals = 0
What goes wrong in the first fmincon call?

We can diagnose this using the fourth return value, output, as specified in the docs:
[xc1, fvalc1, ~, output] = fmincon(fun, x0, [],[],[],[], lb1, [Inf Inf])
The value output.stepsize is the final step size taken in the iterative solving process. In this case:
output.stepsize
>> ans = 6.586e-4
The estimated minimum was at x = [6.457e-4, 6.457e-4] and the lower bounds you've permitted are [0 0], so the solver is not permitted to take another step! Another step of that size would give x = [-1.29e-5, -1.29e-5], which is outside the bounds.
When you allow the lower bounds to be [-1, -1] the solver can over-shoot the minimum and approach it from all directions.
Moreover, we can use the options input to get even better insight!
options.Display = 'iter';
[xc1, fvalc1, ~, output] = fmincon(fun, x0, [],[],[],[], lb1, [Inf Inf], [], options);
Printed to the command window we see this:
Your initial point x0 is not between bounds lb and ub; FMINCON
shifted x0 to strictly satisfy the bounds.
                                            First-order      Norm of
 Iter F-count            f(x)  Feasibility   optimality         step
    0       3    1.960200e+00    0.000e+00    9.900e-01
    1       6    1.220345e-02    0.000e+00    8.437e-01    1.290e+00
    2       9    4.489374e-02    0.000e+00    4.489e-02    1.014e-01
    3      12    1.172900e-02    0.000e+00    1.173e-02    1.036e-01
    4      15    3.453565e-03    0.000e+00    3.454e-03    4.953e-02
    5      18    1.435780e-03    0.000e+00    1.436e-03    2.088e-02
    6      21    4.659097e-04    0.000e+00    4.659e-04    1.631e-02
    7      24    2.379407e-04    0.000e+00    2.379e-04    6.160e-03
    8      27    6.048934e-05    0.000e+00    6.049e-05    7.648e-03
    9      30    1.613884e-05    0.000e+00    1.614e-05    3.760e-03
   10      33    5.096660e-06    0.000e+00    5.097e-06    1.760e-03
   11      36    2.470360e-06    0.000e+00    2.470e-06    6.858e-04
   12      39    8.337765e-07    0.000e+00    8.338e-07    6.586e-04
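So your x0 is invalid for this solver! It sits exactly on the lower bound, and fmincon moved it strictly inside the bounds before starting (the iteration-0 value f(x) = 1.9602 is consistent with a start at x = [0.99 0.99]). This is why the solver doesn't return the result after a single iteration when the lower bounds are [0 0].
fminsearch works for the same reason: it is unconstrained, so there is no bound sitting on the solution.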
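One way to test this diagnosis is to rerun the same problem with an algorithm that does not need a strictly interior starting point. The sketch below assumes the Optimization Toolbox, the optimoptions interface and the 'sqp' algorithm, none of which appear in the original post; the names xs, fs and opts are illustrative. Whether the solver stops right at x0 may depend on your release, so read the iteration display rather than taking the result for granted.

```matlab
% Same problem as above, but with an algorithm other than the default.
fun = @(x) x(1)^2 + x(2)^2;
x0  = [0 0];          % starts exactly on the lower bound
lb1 = [0 0];
ub  = [Inf Inf];

% Assumption: 'sqp' is available in your Optimization Toolbox release.
opts = optimoptions('fmincon', 'Algorithm', 'sqp', 'Display', 'iter');

[xs, fs, exitflag, output] = fmincon(fun, x0, [], [], [], [], lb1, ub, [], opts);

% Inspect how many iterations were needed and which algorithm actually ran.
disp(output.iterations)
disp(output.algorithm)
```

If the returned point differs from what the default algorithm gives, that supports the explanation that the shifted starting point, not the objective, is responsible for the small nonzero result.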

Related

Find intersections of a general vector with zeros MATLAB

Consider a general vector which represents some non-linear function,
for example:
x = [1 2 3 4 5 6 7 8 9 10];
f = [-1 6 8 7 5 2 0.1 -2 -3 3];
Is there a method in MATLAB that can find the solutions of f(x)=0 to some given accuracy?
If you think about it, when all you have is sampled data f with no underlying model, zeros can only be found by interpolating (for example, linearly) between the data points.
For your example, I would define a function myFunc as:
function y = myFunc(val)
x = [1 2 3 4 5 6 7 8 9 10];
f = [-1 6 8 7 5 2 0.1 -2 -3 3];
P = griddedInterpolant (x, f, 'linear', 'linear');
y = P(val);
end
and apply a root searching algorithm via something like fzero:
val = 0;
x = [1 2 3 4 5 6 7 8 9 10];
x = [-inf x inf]; % Look outside boundary too
fun = @myFunc;
sol = zeros(1, numel(x)-1);
cnt = 0;
for i = 1:length(x)-1 % fzero stops at the 1st zero hence the loop over each interval
    bound = [x(i) x(i+1)];
    try
        z = fzero(fun, bound);
        cnt = cnt+1;
        sol(cnt) = z;
    catch
        % No answer within the boundary
    end
end
sol(cnt+1:end) = [];
Maybe you can try interp1 in arrayfun like below (linear interpolation was adopted)
x0 = arrayfun(@(k) interp1(f(k:k+1),x(k:k+1),0),find(sign(f(1:end-1).*f(2:end))<0));
such that
x0 =
1.1429 7.0476 9.5000
DATA
x = [1 2 3 4 5 6 7 8 9 10];
f = [-1 6 8 7 5 2 0.1 -2 -3 3];
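The same linear interpolation can be written without arrayfun by applying the two-point formula to every sign-changing interval at once. This is a sketch under the assumption that x and f have equal length and that no sample is exactly zero (an exact zero in f would not be reported by the strict sign test); the variable name k is illustrative.

```matlab
x = [1 2 3 4 5 6 7 8 9 10];
f = [-1 6 8 7 5 2 0.1 -2 -3 3];

% Indices k where f changes sign between sample k and sample k+1.
k = find(f(1:end-1) .* f(2:end) < 0);

% Zero of the straight line through (x(k), f(k)) and (x(k+1), f(k+1)).
x0 = x(k) - f(k) .* (x(k+1) - x(k)) ./ (f(k+1) - f(k));

disp(x0)
```

For the data above this reproduces the three crossings listed earlier (about 1.1429, 7.0476 and 9.5).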
I've made a function that does it, but feel it is something quite "regular" that matlab must have built in answers...so if someone has any write it down and I will accept it as an answer.
function sol = find_zeros(x,f)
    f_vec = round(f*10^2)/10^2;
    ind=find(diff(sign(f_vec))~=0);
    K = length(ind);
    if (K>0)
        sol = zeros(1,K);
        for k=1:K
            if (f_vec(ind(k))<f_vec(ind(k)+1))
                df = f_vec(ind(k)):0.01:f_vec(ind(k)+1);
            else
                df = flip(f_vec(ind(k)+1):0.01:f_vec(ind(k)));
            end
            dx = linspace(x(ind(k)),x(ind(k)+1),length(df));
            j = find(df==0);
            sol(k) = dx(j);
        end
    else
        sol=[];
    end
    sol=unique(sol);
end

How to compute 1D convolution in Matlab?

Suppose I have 2 vectors, data vector:
x=[2 1 2 1]
and weights vector
y=[1 2 3]
I want MATLAB to convolve these vectors in the sense of a 1D neural network, i.e. slide y as a window along x and compute the dot product at each position.
If I run the built-in function conv, I get
>> conv(x,y)
ans =
2 5 10 8 8 3
which contains the correct values in the middle but has something unknown at the margins. The documentation for conv describes something quite different from what I want.
If I run
>> conv(x,y, 'same')
ans =
5 10 8 8
I also get something strange.
You were very close to solving it by specifying the 3rd input to conv, but instead of 'same' you should've used 'valid':
x = [2 1 2 1];
y = [1 2 3];
conv(x,y,'valid')
ans =
10 8
Just reverse the filter, because conv flips its second argument before sliding it along the first:
x = [2,1,2,1];
y = [1,2,3];
z = conv(x,flip(y),'valid');
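To see why the flip matters, it may help to compare conv against an explicit sliding dot product. The sketch below is illustrative only: the loop and the second data vector x2 are not from the original posts. For the question's x the flipped and unflipped calls happen to agree, which is why a less symmetric x2 is included.

```matlab
x = [2 1 2 1];
y = [1 2 3];

% Sliding dot product (what a 1D neural-network "convolution" computes).
n = numel(x) - numel(y) + 1;          % number of full-overlap positions
z_loop = zeros(1, n);
for k = 1:n
    z_loop(k) = sum(x(k:k+numel(y)-1) .* y);
end

% conv reverses its kernel, so flipping y first undoes that reversal.
z_conv = conv(x, flip(y), 'valid');

% A less symmetric input shows that the flip is not optional in general.
x2 = [1 2 3 4];
z_noflip = conv(x2, y, 'valid');        % true convolution
z_flip   = conv(x2, flip(y), 'valid');  % sliding dot product

disp(z_loop); disp(z_conv); disp(z_noflip); disp(z_flip);
```

Here z_loop and z_conv are both [10 8], while for x2 the unflipped call gives [10 16] and the flipped one gives [14 20].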

Defining a multivariable function for vector inputs

I am a bit new in using Matlab, and I have a question about defining a multivariable function for vector input.
If the function has a single argument, say f(t), I know how to make it work for vector input: the general way is to use arrayfun after defining f(t). What about a multivariable function, say f(x,y)? I want to take two inputs, say [1 2 3] for x and [4 5 6 7] for y (the lengths may differ, but both are either column vectors or row vectors), and compute
[f(1,4),f(1,5),f(1,6),f(1,7);
f(2,4),f(2,5),f(2,6),f(2,7);
f(3,4),f(3,5),f(3,6),f(3,7)]
The difficulty is that the vector inputs for x and y may not have the same length.
I understand it may be difficult to illustrate if I do not have an example of f(x,y). For my use of f(x,y), it may be very complicated to display f(x,y). For simplicity, treat f(x,y) to be x^2+y, and once defined, you cannot change it to x.^2+y for vector inputs.
Here is a set of suggestions using ndgrid:
testfun = @(x,y) x^2+y; % non-vectorized form
x = 1:3;
y = 4:7;
[X,Y] = ndgrid(x,y);
% if the function can be vectorized (fastest!):
testfun_vec = @(x,y) x.^2+y; % vectorized form
A = testfun_vec(X,Y);
% or without ndgrid (also super fast):
B = bsxfun(testfun_vec,x.',y); % use the transpose to take all combinations
% if not, or if it's not bivariate operation (slowest):
C = arrayfun(testfun,X(:),Y(:));
C = reshape(C,length(x),length(y));
% and if you want a loop:
D = zeros(length(x),length(y));
for k = 1:length(X(:))
    D(k) = testfun(X(k),Y(k));
end
Which will output for all cases (A,B,C and D):
5 6 7 8
8 9 10 11
13 14 15 16
As mentioned already, if you can vectorize your function - this is the best solution, and if it has only two inputs bsxfun is also a good solution. Otherwise if you have small data and want to keep your code compact use arrayfun, if you are dealing with large arrays use an un-nested for loop.
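On MATLAB R2016b or later, implicit expansion covers the bsxfun case directly, provided the function is written with element-wise operators. The sketch below assumes such a release and a vectorizable function; the names f_vec and E are illustrative.

```matlab
f_vec = @(x, y) x.^2 + y;   % element-wise form of x^2 + y

x = 1:3;
y = 4:7;

% Column x (3-by-1) combined with row y (1-by-4) expands to a 3-by-4 result.
E = f_vec(x(:), y(:).');

disp(E)
```

Using x(:) and y(:).' forces a column and a row regardless of how the inputs were oriented, which addresses the case where both inputs are rows or both are columns.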
Here is the code using for loops and an anonymous function:
x = [1 2 3];
y = [4 5 6 7];
f = @(x,y) x^2 +y;
A = zeros(length(x), length(y));
for m = 1:length(x)
    for n = 1:length(y)
        A(m, n) = f(x(m), y(n));
    end
end
disp(A);
Result:
A =
5 6 7 8
8 9 10 11
13 14 15 16
>> x = [1 2 3];
>> y = [4 5 6 7];
>> outValue = foo(x, y);
>> outValue
outValue =
5 6 7 8
8 9 10 11
13 14 15 16
Make this function:
function out = foo(x, y)
    out = zeros(length(x), length(y)); % preallocate the result
    for i = 1 : length(x)
        for j = 1 : length(y)
            out(i, j) = x(i)^2 + y(j);
        end
    end
end

The im2col algorithm for ND input

I am trying to write my own im2col algorithm for input dimensions > 2D.
Currently I am looking at the Matlab im2col implementation. However, I cannot find any documentation regarding what is going on for any input of more than 2 dimensions.
I do get an output if I feed a 3D tensor into the function. However, I don't really understand how you get from 2D to ND. The fact that this isn't mentioned in the documentation suggests that it's something straightforward; still, I don't get it.
Heck, I don't even understand why the size of the output matrix is the size it is.
Let me just start by saying that im2col is only intended for 2D matrices. The fact that it sometimes worked (and by that I mean returned a result without throwing an error) is just a happy coincidence.
Now I took a peek at edit im2col.m, and without studying the code too much, the first line of each of the distinct and sliding methods should give you an intuition of what's happening:
...
if strcmp(kind, 'distinct')
[m,n] = size(a);
...
elseif strcmp(kind,'sliding')
[ma,na] = size(a);
...
end
...
First recall that [s1,s2] = size(arr), where arr is a 3D array, will collapse the sizes of the 2nd and 3rd dimensions into one. Here's the relevant passage from the documentation for size:
[d1,d2,d3,...,dn] = size(X) returns the sizes of the dimensions of the array X, provided the number of output arguments n equals ndims(X). If n < ndims(X), di equals the size of the ith dimension of X for 0<i<n, but dn equals the product of the sizes of the remaining dimensions of X, that is, dimensions n through ndims(X).
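A quick check of that rule, using an arbitrary 4-by-3-by-2 array chosen only for illustration (no toolbox needed):

```matlab
arr = zeros(4, 3, 2);

% Two outputs for a 3D array: the last output absorbs the trailing dimensions.
[m, n] = size(arr);     % m is 4, n is 3*2 = 6

% Three outputs report each dimension separately.
[d1, d2, d3] = size(arr);

fprintf('%d-by-%d versus %d-by-%d-by-%d\n', m, n, d1, d2, d3);
```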
So basically for an array of size M-by-N-by-P, the function instead thinks it's a matrix of size M-by-(N*P). Now MATLAB has some quirky indexing rules that let you do things like:
>> x = reshape(1:4*3*2,4,3,2)
x(:,:,1) =
1 5 9
2 6 10
3 7 11
4 8 12
x(:,:,2) =
13 17 21
14 18 22
15 19 23
16 20 24
>> x(:,:)
ans =
1 5 9 13 17 21
2 6 10 14 18 22
3 7 11 15 19 23
4 8 12 16 20 24
which is what I think ended up happening here. Here is an example to confirm the behavior of im2col on an RGB image:
% normal case (grayscale image)
>> M = magic(5);
>> B1 = im2col(M, [3 3], 'sliding');
% (RGB image)
>> MM = cat(3, M, M+50, M+100);
>> B2 = im2col(MM, [3 3], 'sliding');
>> B3 = im2col(reshape(MM, [5 5*3]), [3 3], 'sliding');
>> assert(isequal(B2,B3))
Note that B2 and B3 are equal, so basically think of the result of im2col on an array arr = cat(3,R,G,B) to be the same as that of arr = cat(2,R,G,B) (concatenated horizontally).
Interestingly, you won't get so lucky with the "distinct" blocks method:
>> B1 = im2col(M, [3 3], 'distinct') % works
% ..snip..
>> B2 = im2col(MM, [3 3], 'distinct') % errors
Subscripted assignment dimension mismatch.
Error in im2col (line 59)
aa(1:m,1:n) = a;
Now that we understand what was happening, let's think how to do this properly for 3D arrays.
In my opinion to implement im2col for color images, I would just run it on each color channel separately (each being a 2d matrix), and concatenate the result along the third dimension. So something like this wrapper function:
function B = im2col_rgb(img, sz, varargin)
    B = cell(1,size(img,3));
    for i=1:size(img,3)
        B{i} = im2col(img(:,:,i), sz, varargin{:});
    end
    B = cat(3, B{:});
end

Functional matrix vectorization in MATLAB

To vectorize a matrix in MATLAB, you can execute this simple command:
A = reshape(1:9, 3, 3)
% A =
% [1 4 7]
% [2 5 8]
% [3 6 9]
b = A(:)
% b = [1 2 3 4 5 6 7 8 9]'
But how about if you have a matrix that you want to first slice, then vectorize? How do you go about doing this without assigning to a temporary variable?
Let's say A is now:
A = reshape(1:27, 3, 3, 3)
% A(:,:,1) =
% [1 4 7]
% [2 5 8]
% [3 6 9]
% A(:,:,2) =
% [10 13 16]
% [11 14 17]
% [12 15 18]
% A(:,:,3) =
% [19 22 25]
% [20 23 26]
% [21 24 27]
If you run
b = A(:,:,1)(:)
% Error: ()-indexing must appear last in an index expression.
Is there some function, vectorize(A) that gives this functionality?
b = vectorize(A(:,:,1))
% b = [1 2 3 4 5 6 7 8 9]'
Or if not a function, is there an alternative method than
tmp = A(:,:,1)
b = tmp(:)
Thanks in advance for the guidance!
If only elegance could be measured, but here's one to get through the night -
A(1:numel(A(:,:,1))).'
This is a function that I've seen many seasoned Matlab users add to their code hoard by hand:
function A = vectorize(A)
A = A(:);
% save this code as vectorize.m
Once you've got vectorize.m on your path, then you can do what you want in one line.
You can define the function as an anonymous function if you prefer (the older inline( 'A(:)' ) form is no longer recommended by MathWorks):
vectorize = @(A) A(:);
but then of course you have to ensure that that's in memory for every session.
If for some reason it's unacceptable to write and save your own functions to disk (if so, I wonder how your sanity ever survives using Matlab, but it takes all sorts...) then the following code snippet is a one-liner that uses only builtins, works for arbitrarily high-dimensional A, and is still not too unreadable:
reshape( A, numel(A), 1 )
Note that this, in common with (:) but contrary to what you assume in your question, produces a column vector. Its disadvantage is that A must already be assigned in the workspace, and that assignment may require one extra line. By contrast, the function version can work even on unnamed outputs of other operations—e.g.:
A = vectorize( randn(5) + magic(5) )
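If the goal is only to avoid the temporary variable, reshape can also take the slice directly, with [] letting MATLAB infer the length. This is a sketch of that variant rather than something proposed in the answers above; the names b and c are illustrative.

```matlab
A = reshape(1:27, 3, 3, 3);

% Slice and flatten in one expression; [] lets reshape infer the length.
b = reshape(A(:,:,1), [], 1);   % 9-by-1 column, same as tmp(:)
c = reshape(A(:,:,2), 1, []);   % 1-by-9 row, if a row vector is preferred

disp(b.'); disp(c);
```

Because reshape is an ordinary function call, it accepts the result of the indexing expression directly, so no saved helper is required.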
One-liner for arbitrary indices.
i=3;
A((i-1)*numel(A(:,:,i))+(1:numel(A(:,:,i)))).'