Why does the rowsize of A matter in fmincon - matlab

I have MATLAB code that uses fmincon with some linear constraints. To be able to modify the code, I wondered whether the position of a row within the constraint matrix A makes a difference.
I set up a test file so I can change some variables. It turns out that the position of the constraint is irrelevant for the result, but the number of rows in A and b plays a role. I'm surprised by that, because I would expect a row containing only zeros in A and b to simply cancel out.
fun = @(x)100*(x(2)-x(1)^2)^2 + (1-x(1))^2;
options1 = optimoptions('fmincon','Display','off');
A=zeros(2,2); %setup A
A(2,2)=1; %x2<0
b=[0 0]'; %setup b
x = fmincon(fun,[-1,2],A,b,[],[],[],[],[],options1);x
%change condition position inside A
A=zeros(2,2);
A(1,2)=1; %x2<0
b=[0 0]';
x = fmincon(fun,[-1,2],A,b,[],[],[],[],[],options1);x
% no change; the position doesn't influence fmincon
%change row size of A
A=zeros(1,2);
A(1,2)=1; %x2<0
b=[0]';
x = fmincon(fun,[-1,2],A,b,[],[],[],[],[],options1);x
%change in x2
%increase size of A
A=zeros(10,2);
A(1,2)=1; %x2<0
b=[0 0 0 0 0 0 0 0 0 0]';
x = fmincon(fun,[-1,2],A,b,[],[],[],[],[],options1);x
%change in x2
Can someone explain to me why fmincon is influenced by the number of rows? What is the "right" number of rows in A and b: the number of variables or the number of constraints?
EDIT
For reasons of completeness:
I agree that different values are possible because of the iteration process. Nevertheless I can find situations where the difference is bigger than the tolerance:
Added +log(x(3)) to the function:
fun = @(x)100*(x(2)-x(1)^2)^2 + (1-x(1))^2+log(x(3));
options1 = optimoptions('fmincon','Display','off');
options = optimoptions('fmincon')
A=zeros(2,3); %setup A
A(2,3)=1; %x3<0
b=[0 0]'; %setup b
x = fmincon(fun,[-1,2,1],A,b,[],[],[],[],[],options1);x
%change row size of A
A=zeros(1,3);
A(1,3)=1; %x3<0
b=[0]';
x = fmincon(fun,[-1,2,1],A,b,[],[],[],[],[],options1);x
%change in x2
%increase size of A
A=zeros(10,3);
A(1,3)=1; %x3<0
b=[0 0 0 0 0 0 0 0 0 0]';
x = fmincon(fun,[-1,2,1],A,b,[],[],[],[],[],options1);x
%change in x2
x =
-0.79876 **0.49156** 2.3103e-11
x =
-0.79921 0.49143 1.1341e-11
x =
-0.80253 **0.50099** 5.8733e-12
Matlab support told me that the A matrix should not have more rows than conditions. Each condition makes it more difficult for the algorithm.
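Following that advice, all-zero rows can be stripped from A and b before calling fmincon. The snippet below is only a minimal sketch and was not part of my original test file; the names keep, Ared and bred are made up for illustration. A zero row only states 0 <= b(i), so it is redundant whenever b(i) >= 0, and it makes the problem infeasible otherwise.

```matlab
% Padded constraint matrix from the question: only row 1 is a real constraint.
A = zeros(10,2);
A(1,2) = 1;                 % x2 <= 0
b = zeros(10,1);

keep = any(A ~= 0, 2);      % rows that actually constrain some variable

% A dropped row reads 0 <= b(i); it must hold, or no x is feasible at all.
assert(all(b(~keep) >= 0), 'a zero row with negative b is infeasible');

Ared = A(keep,:);           % hypothetical names for the reduced system
bred = b(keep);
disp(size(Ared))            % one row left for one constraint
```

Ared and bred can then be passed to fmincon in place of A and b.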

Note that fmincon doesn't necessarily give the exact solution, but a good approximation of it according to certain criteria.
The differences in the results are plausible, since fmincon is an iterative algorithm, and the extra matrix multiplications (even if they involve mainly zeros) will eventually lead to slightly different results. MATLAB actually performs these matrix multiplications until it finds the best result. So these results are all correct in the sense that they are all close to the solution.
x =
0.161261791015350 -0.000000117317860
x =
0.161261791015350 -0.000000117317860
x =
0.161261838607809 -0.000000077614999
x =
0.161261877075196 -0.000000096088746
The difference in your results is around 1.0e-07, which is a decent result considering that you don't specify stopping criteria. You can see the defaults with the command
options = optimoptions('fmincon')
My result is
Default properties:
Algorithm: 'interior-point'
CheckGradients: 0
ConstraintTolerance: 1.0000e-06
Display: 'final'
FiniteDifferenceStepSize: 'sqrt(eps)'
FiniteDifferenceType: 'forward'
HessianApproximation: 'bfgs'
HessianFcn: []
HessianMultiplyFcn: []
HonorBounds: 1
MaxFunctionEvaluations: 3000
MaxIterations: 1000
ObjectiveLimit: -1.0000e+20
OptimalityTolerance: 1.0000e-06
OutputFcn: []
PlotFcn: []
ScaleProblem: 0
SpecifyConstraintGradient: 0
SpecifyObjectiveGradient: 0
StepTolerance: 1.0000e-10
SubproblemAlgorithm: 'factorization'
TypicalX: 'ones(numberOfVariables,1)'
UseParallel: 0
For example, I can reach closer results with the option:
options1 = optimoptions('fmincon','Display','off', 'OptimalityTolerance', 1.0e-09);
Result is
x =
0.161262015455003 -0.000000000243997
x =
0.161262015455003 -0.000000000243997
x =
0.161262015706777 -0.000000000007691
x =
0.161262015313928 -0.000000000234186
You can also try other criteria such as MaxFunctionEvaluations to see if you can get even closer results.

Related

How to get indexes of logical matrix without using find in matlab?

Let's assume my matrix A is the output of comparison function i.e. logical matrix having values 0 and 1's only. For a small matrix of size 3*4, we might have something like:
A =
1 1 0 0
0 0 1 0
0 0 1 1
Now, I am generating another matrix B which is of the same size as A, but its rows are filled with indexes of A and any leftover values in each row are set to zero.
B =
1 2 0 0
3 0 0 0
3 4 0 0
Currently, I am using find function on each row of A to get matrix B. Complete code can be written as:
A=[1,1,0,0;0,0,1,0;0,0,1,1];
[rows,columns]=size(A);
B=zeros(rows,columns);
for i=1:rows
currRow=find(A(i,:));
B(i,1:length(currRow))=currRow;
end
For large matrices, the find function takes most of the calculation time according to the MATLAB Profiler. Is there any way to generate matrix B faster?
Note:
Matrix A has more than 1000 columns in each row, but there are never more than 50 non-zero elements per row. Here, I am taking matrix B to be the same size as A, but B could have far fewer columns.
I would suggest using parfor, but the overhead is too much here, and there are more issues with it, so it is not a good solution.
rows = 5e5;
cols = 1000;
A = rand(rows, cols) < 0.050;
I = uint16(1:cols);
B = zeros(size(A), 'uint16');
% [r,c] = find(A);
tic
for i=1:rows
% currRow = find(A(i,:));
currRow = I(A(i,:));
B(i,1:length(currRow)) = currRow;
end
toc
@Cris suggests replacing find with an indexing operation. It increases the performance by about 10%.
Apparently, there is no better optimization as long as B is required to be in the specific form you describe. I suggest using [r,c] = find(A); if the indices are not required in matrix form.
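For completeness, a loop-free variant is possible with sort. This is an untimed sketch on the small matrix from the question, not something I benchmarked against the loop above; sorting each row does more work than a linear scan, so it may well be slower for wide rows. The names isPad and order are made up for illustration.

```matlab
A = logical([1,1,0,0; 0,0,1,0; 0,0,1,1]);

% Ascending sort of ~A moves the true entries of A to the front of each row.
% sort keeps tied elements in their original order, so the column numbers
% of the true entries stay increasing.
[isPad, order] = sort(double(~A), 2);

B = order;
B(isPad == 1) = 0;          % positions beyond the last true entry become 0
disp(B)
```

If the narrower form of B is wanted, the trailing all-zero columns can be cut off at max(sum(A,2)).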

How to Implement Box Function in better way in Matlab?

This MATLAB function implements a box function:
the output is 1 while sample n belongs to the range (-a, +a),
and 0 outside that range.
How can I implement this function in MATLAB in a better way, so that the plot can be shifted for negative values of time without using negative values as array indices?
Thanks in Advance
function B_X = Box_Func(N,K,a)
if(N <= 0)||(K+a > N)
warning('Please Enter Valid Positive Integer !');
else
B_X = zeros([-N N]);
for i = -N : 1 : N
if (i >= K-a) && (i <= K+a)
B_X(i)=1;
end
end
end
end
Your question is unclear since it does not really explain what you want to do, and in one comment you state that you know where the error comes from. I suggest to read the docs (also in a comment), but here I'll show you my problems with your code, provide some simple ways of testing your code and I hope this helps to solve your problem and to understand how to ask better questions.
First, one remark to the lines
if(N <= 0)||(K+a > N) % if samples Number wrong, or shifting exceeds limit
% of Samples Print a warning.
warning('Please Enter Valid Positive Integer !');
I suggest to throw an error instead of a warning if the input parameters are wrong and will lead to an error anyway. Otherwise you could omit the test and let Matlab throw the respective error.
The next misunderstanding is
B_X = zeros([-N N])
What do you expect B_X to be after this line if, let's say, N=2? Test if the result is what you expect by simply entering this command in the command line directly:
>> zeros([-2 2])
ans =
0×2 empty double matrix
I guess that's not what you expect. As the docs state, zeros(N) will yield a square matrix with N rows and N columns; zeros(M,N) will yield a matrix with M rows and N columns. Look:
>> zeros(2)
ans =
0 0
0 0
>> zeros(2,1)
ans =
0
0
I do not know what you expect from zeros([-2 2]), but I guess that you are looking for one of the following:
>> N = 2;
>> zeros(2*N+1,1)
ans =
0
0
0
0
0
>> zeros(1,2*N+1)
ans =
0 0 0 0 0
>> zeros(2*N+1)
ans =
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
My guess is that somehow you expect the function zeros to operate on some range of indices you provide. Your misunderstanding might be that you expect zeros([-2 2]) to provide a vector of zeros into which you can index using -2:2 (that is, one of -2,-1,0,+1,+2). If you assume this, your assumption is wrong.
I guess this from the line
for i = -N : 1 : N
in your code. Due to this line, I first thought that
>> B_X = zeros(1, 2*N+1)
B_X =
0 0 0 0 0
is what you expect. However, from the comment
% if samples Number wrong, or shifting exceeds limit
I guessed that N might be just the number of data points in the result. This would mean
>> B_X = zeros(1, N)
B_X =
0 0
(which would not make much sense for N=2). So, for the next question you ask (or an edit to this question): Clearly explain the meaning of the function inputs!
Since later, you set the limits of your x-axes to [-N N], I'll keep my first assumption, thus the number of data points (and therefore the argument to zeros) should be 2*N+1.
The next argument to your function is K, which you call the shift. K only occurs in combination with the third input of the function, a. You do not provide any information about a.
From this line I guess that a is something that specifies a width:
if (i >= K-a) && (i <= K+a)
Now, slowly, if one also considers
B_X(i)=1;
and the usage of the word box (and heaviside, which is still in the comments), one can get a clue of what you want to do. Together with your comment that you want your function
to appear on the plotting shifted on X Axis, so it looks like that the center of the Box Function is in the negative area of X axis
Might this be your goal: I want to plot a vector from -N to N (in steps of 1) with zero values except for the region of -K±a, where I want it to be one?
If this is the case, one attempt would be as follows (it remains to you to put it into a function):
>> N=15;
>> K=7;
>> a = 3;
Get the x-values:
>> x = -N:N;
(-15, -14, ... 14, 15). Next, allocate B_X:
>> B_X = zeros(1, 2*N+1);
Last, use logical indexing to set the values around -K±a to one:
>> B_X(x>(-K-a) & x<(-K+a)) = 1;
Eventually, plot the result:
plot(x,B_X);
and adjust the axis limits:
>> ax=gca;
>> ax.YLim = [-.2 1.2];
Result is:
function B_X = Box_Func(N,K,a)
% Box_Func This Function takes Number of Samples N,Count of shift K, and
% half of Box width a then it stem Them ,
% Note it works only for positive shifting
% that means K should be positive or Zero
if(N <= 0)||(K+a > N) % if samples Number wrong, or shifting exceeds limit
% of Samples, raise an error so the plotting code below never runs
% with x and B_X undefined.
error('Please Enter Valid Positive Integer !');
else % if The inputs are fine , then :
B_X = zeros([1 2*N+1]);
x = -N:N;
B_X(x>=(K-a) & x<=(K+a)) = 1;
end
%===========================================
% Plotting the Results
%===========================================
figure('Name','Box Function','NumberTitle','off');
stem(x,B_X)
hold on
xlabel('Samples')
ylabel('Box Shifted Function')
xlim([-N N]) ;
ylim([-1 2]);
grid on
hold off
end
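The help text above says the function only works for positive shifts, but the logical comparison against x does not actually depend on the sign of K. A minimal sketch of that point, using a hypothetical anonymous helper boxFunc instead of the full plotting function, and assuming the box has to fit inside [-N, N]:

```matlab
N = 15;  a = 3;
x = -N:N;                                  % sample positions, may be negative

% Compare against x instead of indexing with it, so K may have either sign.
boxFunc = @(K) double(x >= K - a & x <= K + a);   % hypothetical helper

K = -7;                                    % negative shift
assert(abs(K) + a <= N, 'box must fit inside [-N, N]');
B_neg = boxFunc(K);                        % box centred at x = -7
disp(x(B_neg == 1))                        % positions where the box is 1
```

Because no element of x is ever used as an index, negative sample positions never reach the array subscripts; stem(x, B_neg) would then draw the box on the negative side of the axis.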

MATLAB: efficient generation of a large integer matrix of multi-indices

Let d and p be two integers. I need to generate a large matrix A of integers, having d columns and N=nchoosek(d+p,p) rows. Note that nchoosek(d+p,p) increases quickly with d and p, so it's very important that I can generate A quickly. The rows of A are all the multi-indices with components from 0 to p, such that the sum of the components is less than or equal to p. This means that, if d=3 and p=3, then A is an [N=nchoosek(3+3,3)=20x3] matrix with the following structure:
A=[0 0 0;
1 0 0;
0 1 0;
0 0 1;
2 0 0;
1 1 0;
1 0 1;
0 2 0;
0 1 1;
0 0 2;
3 0 0;
2 1 0;
2 0 1;
1 2 0;
1 1 1;
1 0 2;
0 3 0;
0 2 1;
0 1 2;
0 0 3]
It is not indispensable to follow exactly the row ordering I used, although it would make my life easier (for those interested, it's called graded lexicographical ordering and it's described here:
http://en.wikipedia.org/wiki/Monomial_order).
In case you are curious about the origin of this weird matrix, let me know!
Solution using nchoosek and diff
The following solution is based on this clever answer by Mark Dickinson.
function degrees = monomialDegrees(numVars, maxDegree)
if numVars==1
degrees = (0:maxDegree).';
return;
end
degrees = cell(maxDegree+1,1);
k = numVars;
for n = 0:maxDegree
dividers = flipud(nchoosek(1:(n+k-1), k-1));
degrees{n+1} = [dividers(:,1), diff(dividers,1,2), (n+k)-dividers(:,end)]-1;
end
degrees = cell2mat(degrees);
You can get your matrix by calling monomialDegrees(d,p).
Solution using nchoosek and accumarray/histc
This approach is based on the following idea: There is a bijection between all k-multicombinations and the matrix we are looking for. The multicombinations give the positions, where the entries should be added. For example the multicombination [1,1,1,1,3] will be mapped to [4,0,1], as there are four 1s, and one 3. This can be either converted using accumarray or histc. Here is the accumarray-approach:
function degrees = monomialDegrees(numVars, maxDegree)
if numVars==1
degrees = (0:maxDegree).';
return;
end
degrees = cell(maxDegree+1,1);
degrees{1} = zeros(1,numVars);
for n = 1:maxDegree
pos = nmultichoosek(1:numVars, n);
degrees{n+1} = accumarray([reshape((1:size(pos,1)).'*ones(1,n),[],1),pos(:)],1);
end
degrees = cell2mat(degrees);
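To make the mapping concrete, the single example from the text can be checked in isolation. This is just an illustrative sketch; multicomb and degreesRow are names chosen here, not part of the function above.

```matlab
numVars = 3;
multicomb = [1 1 1 1 3];                   % the example multicombination

% Count how often each variable index 1..numVars occurs.
degreesRow = accumarray(multicomb(:), 1, [numVars 1]).';
disp(degreesRow)
```

The function above does the same counting for all multicombinations at once by adding a row subscript as the first column of the subscript matrix.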
And here the alternative using histc:
function degrees = monomialDegrees(numVars, maxDegree)
if numVars==1
degrees = (0:maxDegree).';
return;
end
degrees = cell(maxDegree+1,1);
degrees(1:2) = {zeros(1,numVars); eye(numVars);};
for n = 2:maxDegree
pos = nmultichoosek(1:numVars, n);
degrees{n+1} = histc(pos.',1:numVars).';
end
degrees = cell2mat(degrees(1:maxDegree+1));
Both use the following function to generate multicombinations:
function combs = nmultichoosek(values, k)
if numel(values)==1
n = values;
combs = nchoosek(n+k-1,k);
else
n = numel(values);
combs = bsxfun(@minus, nchoosek(1:n+k-1,k), 0:k-1);
combs = reshape(values(combs),[],k);
end
Benchmarking:
Benchmarking the above codes yields that the diff-solution is faster if your numVars is low and maxDegree high. If numVars is higher than maxDegree, then the histc solution will be faster.
Old approach:
This is an alternative to Dennis' approach of dec2base, which has a limit on the maximum base. It is still a lot slower than the above solutions.
function degrees = monomialDegrees(numVars, maxDegree)
Cs = cell(1,numVars);
[Cs{:}] = ndgrid(0:maxDegree);
degrees = reshape(cat(numVars+1, Cs{:}),(maxDegree+1)^numVars,[]); % concatenate along a new trailing dimension
degrees = degrees(sum(degrees,2)<=maxDegree,:);
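Although slow, the ndgrid filter is simple enough to serve as a brute-force reference when testing the faster generators on small inputs. The sketch below inlines the same idea for d=3 and p=3; the sortrows step is an addition that is not in the code above, included on the assumption that the row order from the question should be reproduced. The names grids, cand and order are illustrative.

```matlab
d = 3;  p = 3;

% Enumerate all (p+1)^d candidates and keep those with component sum <= p.
grids = cell(1, d);
[grids{:}] = ndgrid(0:p);
cand = reshape(cat(d + 1, grids{:}), [], d);
A = cand(sum(cand, 2) <= p, :);

% Order by total degree first, then by larger leading components first.
[~, order] = sortrows([sum(A, 2), -A]);
A = A(order, :);

% The row count must match the formula from the question.
assert(size(A, 1) == nchoosek(d + p, p));
disp(A(1:5, :))
```

The memory use grows like (p+1)^d, so this is only meant for checking small cases.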
I would solve it this way:
ncols=d;
colsum=p;
base=(0:colsum)';
v=@(dm)permute(base,[dm:-1:1]);
M=bsxfun(@plus,base,v(2));
for idx=3:ncols
M=bsxfun(@plus,M,v(idx));
end
L=M<=colsum;
A=cell(1,ncols);
[A{:}]=ind2sub(size(L),find(L));
a=cell2mat(A);
%subtract 1 because 1 based indexing but base starts at 0
a=a-1+min(base);
It builds up a d-dimensional array that contains the sums of the components. The efficiency of this code depends on sum(L(:))/numel(L); this quotient tells you how much of the created array is actually used for solutions. If it gets low for your input, there probably exists a better solution.
Here is a very easy way to do it:
L = dec2base(0:4^3-1,4);
idx=sum(num2str(L)-'0',2)<=3;
L(idx,:)
I think the first line can be very time efficient for creating a list of candidates, but unfortunately I don't know how to reduce the list in an efficient way after that.
So the second line works, but could use improvement performance wise.

How to make a general case of inserting ones in any type of matrix, in the non-principal diagonal

The title might be confusing, so here's a particular example to explain myself. Also, I'm not sure what you call the diagonal that starts at (1,2) and goes onward through (2,3), (3,4) and so on: non-principal or non-main diagonal, I'm not sure at all.
3x3 case
-1 1 0
-1 0 1
0 -1 1
4x4 case
-1 1 0 0
-1 0 1 0
-1 0 0 1
0 -1 1 0
0 -1 0 1
0 0 -1 1
So if the original matrix was a 4x4 (or any other size), I am able to make a matrix the size of the second example. I now have to insert the -1 and 1's in this fashion. This means n-1 number of -1's inserted if j=1, and then, a n-1 number of ones in the non-principal diagonal. When this is done, it's the same but for j=2 and the next non-principal diagonal, and so on.
Thing is, I'm thinking all the time about loops, and too many cases arise, because what I want is to be able to do this for any possible dimension, not for a particular case.
But then I saw this post Obtaining opposite diagonal of a matrix in Matlab
With this answer: A(s:s-1:end-1)
And it seems like a much cleaner way of doing it, since my own way (not finished since I'm not able to figure all the cases) has too many conditions. With a sentence like that, I could choose the diagonal, insert ones, and do it as many times as required, depending of the n dimension.
This leaves the problem of inserting the -1's, but I guess I could manage something.
It seems to me that you want to obtain the following matrix B of size (n-1)*n/2 × n:
n = 4;
idx = fliplr(fullfact([n n]));
idx(diff(idx')<=0,:) = [];
m = size(idx,1);
B = zeros(m,n);
B(sub2ind(size(B),1:m,idx(:,1)')) = -1;
B(sub2ind(size(B),1:m,idx(:,2)')) = 1;
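Note that fullfact is part of the Statistics and Machine Learning Toolbox. Assuming that toolbox is not available, the same index pairs can be enumerated with nchoosek from base MATLAB. This is a sketch of an alternative, not the code above rewritten; pairs and rows are names chosen for illustration.

```matlab
n = 4;
pairs = nchoosek(1:n, 2);            % rows (i,j) with i < j, in lexicographic order
m = size(pairs, 1);                  % n*(n-1)/2 rows

B = zeros(m, n);
rows = (1:m).';
B(sub2ind(size(B), rows, pairs(:,1))) = -1;   % -1 in column i
B(sub2ind(size(B), rows, pairs(:,2))) =  1;   % +1 in column j
disp(B)
```

Each row of B then holds one -1 and one 1, with the -1 always to the left of the 1.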
Approach #1
Here's a vectorized approach that has more memory requirements than a non-vectorized or for-loop based one. So, it could be tried out for small to medium sized datasizes.
The basic idea is this. For n=4 as an example, we take
-1 1 0 0
-1 0 1 0
-1 0 0 1
as the basic building block, replicate it n-1 i.e. 3 times and then remove the rows that aren't supposed to be part of the final output as per the requirements of the problem. Because of this very nature, this solution has more memory requirements, as we need to remove rows 6,8,9 for n = 4 case. But this gives us the opportunity to work with everything in one go.
N = n-1; %// minus 1 of the datasize, n
blksz = N*(N+1); %// number of elements in a (n-1)*n blocksize that is replicated
b1 = [-1*ones(N,1) eye(N)] %// Create that special starting (n-1)*n block
idx1 = find(b1~=0) %// find non zero elements for the starting block
idx2 = bsxfun(@plus,idx1,[0:N-1]*(blksz+N)) %// non zero elements for all blocks
b1nzr = repmat(b1(b1~=0),[1 N]) %// elements for all blocks
vald_ind = bsxfun(@le,idx2,[1:N]*blksz) %// positions of valid elements all blocks
mat1 = zeros(N,blksz) %// create an array for all blocks
mat1(idx2(vald_ind)) = b1nzr(vald_ind) %// put right elements into right places
%// reshape into a 3D array, join/concatenate along dim3
out = reshape(permute(reshape(mat1,N,N+1,[]),[1 3 2]),N*N,[])
%// remove rows that are not entertained according to the requirements of problem
out = out(any(out==1,2),:)
Approach #2
Here's a loop-based version that could be easier to follow if you have to explain it to yourself or to other people and, most importantly, scales up pretty well in performance across varying data sizes.
start_block = [-1*ones(n-1,1) eye(n-1)] %// Create that special starting (n-1)*n block
%// Find starting and ending row indices for each shifted block to be repeated
ends = cumsum([n-1:-1:1])
starts = [1 ends(1:end-1)+1]
out = zeros(sum(1:n-1),n) %// setup all zeros array to store output
for k1 = 1:n-1
%// Put elements from shifted portion of start_block for creating the output
out(starts(k1):ends(k1),k1:end) = start_block(1:n-k1,1:n-k1+1)
end
With n=4, the output -
out =
-1 1 0 0
-1 0 1 0
-1 0 0 1
0 -1 1 0
0 -1 0 1
0 0 -1 1
I don't know if I understood properly, but is this what you are looking for:
M=rand(5);
k=1; % this is to select the k-th diagonal
D=diag(ones(1,size(M,2)-abs(k)), k);
M(D==1)=-1;
M =
0.9834 -1.0000 0.8402 0.6310 0.0128
0.8963 0.1271 -1.0000 0.3164 0.6054
0.8657 0.6546 0.3788 -1.0000 0.5765
0.8010 0.8640 0.2682 0.4987 -1.0000
0.5550 0.2746 0.1529 0.7386 0.6550

Solving for variables in an over-parameterised system

I am trying to write a Matlab program that accepts variables for a system from the user, but there are more variables than system parameters. To be specific, six variables in three equations:
w - d - M = 0
l - d - T = 0
N - T + M = 0
This could be represented in matrix form as A*x=0 where
A = [1 0 0 -1 0 -1;
0 1 0 -1 -1 0;
0 0 1 0 -1 1];
x = [w l N d T M]';
I would like to be able to solve this system given a known subset of the variables. For example, if the user gives d, T, M, then the system is trivially solved for the other three variables. If the user supplies w, N, M, then it becomes a solvable 3-DOF system. And so on. (If the user over- or under-specifies the system then an error may of course result.)
Given any one of these combinations it's simple to (a priori) use matrix algebra to calculate the unknown quantities. But I don't know how to solve the general case, aside from using the symbolic toolbox (which I prefer not to do for compatibility reasons).
When I started with this approach I thought this step would be easy, but my linear algebra is rusty; am I missing something simple?
First, let x be a vector with NaN for the unknown values. This allows you to use ISNAN to find the indices of the unknowns. If you calculate A*x for only the user-specified terms, that gives you a column of constants b. Take those constants to the right-hand side of the equation, and you have an equation of the form A*x = -b for the unknowns.
A = [1 0 0 -1 0 -1;
0 1 0 -1 -1 0;
0 0 1 0 -1 1];
idx = ~isnan(x);
b = A(:,idx)*x(idx); % user provided constants
z = A(:,~idx)\(-b); % solution of Ax = -b
x(~idx) = z;
With input x = [NaN NaN NaN 1 1 1]', for instance, you get the result [2 2 0 1 1 1]'. This uses MLDIVIDE, I'm not well versed enough in linear algebra to know whether PINV or something else would be better.
Given the linear system
A = [1 0 0 -1 0 -1;
0 1 0 -1 -1 0;
0 0 1 0 -1 1];
A*x = 0
Where the elements of x are identified as:
x = [w l N d T M]';
Now, suppose that {d,T,M} have known, fixed values. What we need are the indices of these elements in x. We've chosen the 4th, 5th and 6th elements of x to be knowns.
known_idx = [4 5 6];
unknown_idx = setdiff(1:6,known_idx);
Now, let me pick some arbitrary numbers for those known variables.
xknown = [1; -3; 7.5];
We will partition A into two submatrices, corresponding to the known and unknown variables.
Aknown = A(:,known_idx);
Aunknown = A(:,unknown_idx);
Now, move the known values to the right hand side of the equality, and solve. See that Aunknown is a 3x3 matrix, so the problem is (hopefully) well posed.
xunknown = Aunknown\(-Aknown*xknown)
xunknown =
8.5
-2
-10.5
Combine it all into the final solution.
x = zeros(6,1);
x(known_idx) = xknown;
x(unknown_idx) = xunknown;
x =
8.5
-2
-10.5
1
-3
7.5
Note that I've expanded this all out into a few lines to show what is happening more clearly. But I could have done it all in just a line or two of code had I wanted to be parsimonious.
Finally, see that had I chosen some other sets of numbers to be the knowns, such as {l,d,T}, then the resulting system would be singular. So you must watch for that event. A test on the rank of Aunknown might be useful to weed out the problems. Or you might choose to employ pinv to build the solution.
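The rank test mentioned above could look roughly like this. It is only a sketch: isSolvable is a hypothetical helper name, and it only checks that the unknown columns have full column rank, so an over-specified input can still be inconsistent.

```matlab
A = [1 0 0 -1  0 -1;
     0 1 0 -1 -1  0;
     0 0 1  0 -1  1];

% Hypothetical helper: true if the columns of the unknowns have full
% column rank, i.e. the unknowns are uniquely determined.
isSolvable = @(known_idx) ...
    rank(A(:, setdiff(1:6, known_idx))) == 6 - numel(known_idx);

disp(isSolvable([4 5 6]))   % {d,T,M} known
disp(isSolvable([2 4 5]))   % {l,d,T} known
```

For {d,T,M} the unknown block is the identity, so the check passes; for {l,d,T} the second row of the unknown block is all zeros, so it fails and the caller can raise an error before attempting the solve.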
The system of equations is fixed? What if you store the variables present in your three equations in a list per equation:
(w, d, M)
(l, d, T)
(N, T, M)
Then you get the user input and you can calculate the number of variables given in each equation:
User input: w, N, M
Given variables:
(w, d, M) -> 2
(l, d, T) -> 0
(N, T, M) -> 1
This would trivially give you d from the first equation. Therefore you end up with two equations containing two variables, and you know the equation system you have to solve.
It's basically your own simple symbolic solver for a single system of equations.