Solving for variables in an over-parameterised system - matlab

I am trying to write a Matlab program that accepts variables for a system from the user, but there are more variables than equations. To be specific, six variables in three equations:
w - d - M = 0
l - d - T = 0
N - T + M = 0
This could be represented in matrix form as A*x=0 where
A = [1 0 0 -1 0 -1;
0 1 0 -1 -1 0;
0 0 1 0 -1 1];
x = [w l N d T M]';
I would like to be able to solve this system given a known subset of the variables. For example, if the user gives d, T, M, then the system is trivially solved for the other three variables. If the user supplies w, N, M, then it becomes a solvable 3-DOF system. And so on. (If the user over- or under-specifies the system then an error may of course result.)
Given any one of these combinations it's simple to (a priori) use matrix algebra to calculate the unknown quantities. But I don't know how to solve the general case, aside from using the symbolic toolbox (which I prefer not to do for compatibility reasons).
When I started with this approach I thought this step would be easy, but my linear algebra is rusty; am I missing something simple?

First, let x be a vector with NaN for the unknown values. This allows you to use ISNAN to find the indices of the unknowns. If you calculate A*x using only the user-specified terms, that gives you a column of constants b. Take those constants to the right-hand side of the equation, and you are left with a reduced system of the form A(:,unknown)*z = -b.
x = [NaN NaN NaN 1 1 1]'; % NaN marks the unknowns; here d, T, M are known
A = [1 0 0 -1 0 -1;
0 1 0 -1 -1 0;
0 0 1 0 -1 1];
idx = ~isnan(x);     % logical index of the known variables
b = A(:,idx)*x(idx); % constants contributed by the user-provided values
z = A(:,~idx)\(-b);  % solve A(:,~idx)*z = -b for the unknowns
x(~idx) = z;         % write the solution back into x
With the input above, for instance, you get the result [2 2 0 1 1 1]'. This uses MLDIVIDE; I'm not well versed enough in linear algebra to know whether PINV or something else would be better.
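If you do want to try PINV, it is a one-line change. This is just a sketch, but pinv would return the minimum-norm least-squares solution when the reduced matrix is singular or not square:
% Sketch: pseudoinverse alternative to backslash. For a square, full-rank
% A(:,~idx) it gives the same answer; otherwise it returns a minimum-norm
% least-squares solution instead of erroring.
x(~idx) = pinv(A(:,~idx))*(-b);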

Given the linear system
A = [1 0 0 -1 0 -1;
0 1 0 -1 -1 0;
0 0 1 0 -1 1];
A*x = 0
Where the elements of x are identified as:
x = [w l N d T M]';
Now, suppose that {d,T,M} have known, fixed values. What we need are the indices of these elements in x. We've chosen the 4th, 5th and 6th elements of x to be knowns.
known_idx = [4 5 6];
unknown_idx = setdiff(1:6,known_idx);
Now, let me pick some arbitrary numbers for those known variables.
xknown = [1; -3; 7.5];
We will partition A into two submatrices, corresponding to the known and unknown variables.
Aknown = A(:,known_idx);
Aunknown = A(:,unknown_idx);
Now, move the known values to the right-hand side of the equality, and solve. See that Aunknown is a 3x3 matrix, so the problem is (hopefully) well posed.
xunknown = Aunknown\(-Aknown*xknown)
xunknown =
8.5
-2
-10.5
Combine it all into the final solution.
x = zeros(6,1);
x(known_idx) = xknown;
x(unknown_idx) = xunknown;
x =
8.5
-2
-10.5
1
-3
7.5
Note that I've expanded this all out into a few lines to show what is happening more clearly. But I could have done it all in just a line or two of code had I wanted to be parsimonious.
Finally, see that had I chosen some other sets of numbers to be the knowns, such as {l,d,T}, then the resulting system would be singular. So you must watch for that event. A test on the rank of Aunknown might be useful to weed out the problems. Or you might choose to employ pinv to build the solution.
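A minimal sketch of that rank test, using the variables defined above:
% Sketch: guard against a singular partition before solving.
if rank(Aunknown) < numel(unknown_idx)
    warning('These knowns leave a singular system; using pinv instead.')
    xunknown = pinv(Aunknown)*(-Aknown*xknown);  % minimum-norm solution
else
    xunknown = Aunknown\(-Aknown*xknown);
end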

Is the system of equations fixed? If so, what if you store the variables present in your three equations in a list per equation:
(w, d, M)
(l, d, T)
(N, T, M)
Then you get the user input and you can calculate the number of variables given in each equation:
User input: w, N, M
Given variables:
(w, d, M) -> 2
(l, d, T) -> 0
(N, T, M) -> 2
This would trivially give you d from the first equation. Therefore you end up with two equations containing two unknowns, and you know which equation system you have to solve.
It's basically your own simple symbolic solver for a single system of equations.
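A minimal sketch of that idea, reusing A from the question and marking unknowns with NaN (the particular input values here are hypothetical):
% Sketch: repeatedly look for an equation of A*x = 0 with exactly one
% unknown left and solve it directly, until nothing more can be resolved.
A = [1 0 0 -1 0 -1;
     0 1 0 -1 -1 0;
     0 0 1 0 -1 1];
x = [1 NaN 5 NaN NaN 2]';  % user gives w, N, M
changed = true;
while changed && any(isnan(x))
    changed = false;
    for r = 1:size(A,1)
        unk = find(A(r,:) ~= 0 & isnan(x)');
        if numel(unk) == 1   % one unknown in this equation: solve for it
            knw = A(r,:) ~= 0 & ~isnan(x)';
            x(unk) = -A(r,knw)*x(knw)/A(r,unk);
            changed = true;
        end
    end
end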

Related

Why does the row size of A matter in fmincon

I have Matlab code which uses fmincon with some constraints. So that I am able to modify the code, I have been wondering whether the row position of a condition within the constraint matrix A makes a difference.
I set up a test file so I can change some variables. It turns out that the position of the condition is irrelevant for the result, but the number of rows in A and b plays a role. I'm surprised by that, because I would expect a row with only zeros in A and b to simply cancel out.
fun = @(x)100*(x(2)-x(1)^2)^2 + (1-x(1))^2;
options1 = optimoptions('fmincon','Display','off');
A=zeros(2,2); %setup A
A(2,2)=1; %x2 <= 0
b=[0 0]'; %setup b
x = fmincon(fun,[-1,2],A,b,[],[],[],[],[],options1);x
%change condition position inside A
A=zeros(2,2);
A(1,2)=1; %x2 <= 0
b=[0 0]';
x = fmincon(fun,[-1,2],A,b,[],[],[],[],[],options1);x
% no change; the position doesn't influence fmincon
%change row size of A
A=zeros(1,2);
A(1,2)=1; %x2 <= 0
b=[0]';
x = fmincon(fun,[-1,2],A,b,[],[],[],[],[],options1);x
%change in x2
%increase size of A
A=zeros(10,2);
A(1,2)=1; %x2 <= 0
b=[0 0 0 0 0 0 0 0 0 0]';
x = fmincon(fun,[-1,2],A,b,[],[],[],[],[],options1);x
%change in x2
Can someone explain to me why fmincon is influenced by the number of rows? What is the "right" row count for A and b? The number of variables or the number of conditions?
EDIT
For completeness:
I agree that different values are possible because of the iteration process. Nevertheless I can find situations where the difference is bigger than the tolerance:
Added +log(x(3)) to the function:
fun = @(x)100*(x(2)-x(1)^2)^2 + (1-x(1))^2+log(x(3));
options1 = optimoptions('fmincon','Display','off');
A=zeros(2,3); %setup A
A(2,3)=1; %x3 <= 0
b=[0 0]'; %setup b
x = fmincon(fun,[-1,2,1],A,b,[],[],[],[],[],options1);x
%change row size of A
A=zeros(1,3);
A(1,3)=1; %x3 <= 0
b=[0]';
x = fmincon(fun,[-1,2,1],A,b,[],[],[],[],[],options1);x
%change in x2
%increase size of A
A=zeros(10,3);
A(1,3)=1; %x3 <= 0
b=[0 0 0 0 0 0 0 0 0 0]';
x = fmincon(fun,[-1,2,1],A,b,[],[],[],[],[],options1);x
%change in x2
x =
-0.79876 0.49156 2.3103e-11
x =
-0.79921 0.49143 1.1341e-11
x =
-0.80253 0.50099 5.8733e-12
(note the difference in x(2) between the first and third run)
Matlab support told me that the A matrix should not have more rows than conditions. Each condition makes it more difficult for the algorithm.
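Following that advice, a simple way to make sure A has exactly one row per real condition is to drop the all-zero rows before the call. This is only a sketch (x0 stands for whatever start point you use):
% Sketch: keep only rows of A that encode an actual condition, together
% with the matching entries of b.
keep = any(A ~= 0, 2);  % rows with at least one nonzero coefficient
A = A(keep,:);
b = b(keep);
x = fmincon(fun, x0, A, b, [], [], [], [], [], options1);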
Note that fmincon doesn't necessarily give the exact solution, but a good approximation of the solution according to certain criteria.
The differences in your results are plausible, since fmincon is an iterative algorithm and these matrix multiplications (even if they contain mostly zeros) will eventually end with slightly different results. Matlab actually performs these matrix multiplications on each iteration until it finds the best result, so all these results are correct in the sense that they are all close to the solution.
x =
0.161261791015350 -0.000000117317860
x =
0.161261791015350 -0.000000117317860
x =
0.161261838607809 -0.000000077614999
x =
0.161261877075196 -0.000000096088746
The difference in your results is around 1.0e-07, which is a decent result considering you don't specify stopping criteria. You can see what you have by default with the command
options = optimoptions('fmincon')
My result is
Default properties:
Algorithm: 'interior-point'
CheckGradients: 0
ConstraintTolerance: 1.0000e-06
Display: 'final'
FiniteDifferenceStepSize: 'sqrt(eps)'
FiniteDifferenceType: 'forward'
HessianApproximation: 'bfgs'
HessianFcn: []
HessianMultiplyFcn: []
HonorBounds: 1
MaxFunctionEvaluations: 3000
MaxIterations: 1000
ObjectiveLimit: -1.0000e+20
OptimalityTolerance: 1.0000e-06
OutputFcn: []
PlotFcn: []
ScaleProblem: 0
SpecifyConstraintGradient: 0
SpecifyObjectiveGradient: 0
StepTolerance: 1.0000e-10
SubproblemAlgorithm: 'factorization'
TypicalX: 'ones(numberOfVariables,1)'
UseParallel: 0
For example, I can reach closer results with the option:
options1 = optimoptions('fmincon','Display','off', 'OptimalityTolerance', 1.0e-09);
Result is
x =
0.161262015455003 -0.000000000243997
x =
0.161262015455003 -0.000000000243997
x =
0.161262015706777 -0.000000000007691
x =
0.161262015313928 -0.000000000234186
You can also try to play with other criteria, such as MaxFunctionEvaluations or MaxIterations, to see if you can get even closer results...
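For instance, here is a sketch that tightens several of those criteria at once (the option names all appear in the defaults listed above; the values are arbitrary choices):
options2 = optimoptions('fmincon', 'Display', 'off', ...
    'OptimalityTolerance', 1e-9, ...
    'StepTolerance', 1e-12, ...
    'MaxFunctionEvaluations', 1e4, ...
    'MaxIterations', 1e4);
x = fmincon(fun, [-1,2,1], A, b, [], [], [], [], [], options2);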

Trying to solve a system of linear equations in matlab

I've been set a question asking me to solve a system of linear equations. In the question it states I should set up a matrix A and column vector b to solve the equation Ax=b, where x is the column vector (w x y z).
A = [1 1 1 1; 0 1 4 -2; 2 0 -2 1; 1 -2 -1 1]
b = [28;7;22;-4]
A1 = inv(A).*b
sum(A1,2)
This is what I've done so far, however I know the answer that MATLAB gives me is incorrect, as the right solutions should be w=10.5, x=9, y=2.5, z=6.
Can someone point me in the right direction/ show me where I'm going wrong? (I'm fairly new to MATLAB so very unsure about it all).
Thanks.
A = [1 1 1 1; 0 1 4 -2; 2 0 -2 1; 1 -2 -1 1];
b = [28;7;22;-4];
x = A \ b;  % x = [w; x; y; z] = [10.5; 9; 2.5; 6]
For a reference concerning the \ operator, please read this: https://it.mathworks.com/help/matlab/ref/mldivide.html
The correct code to use your technique would be:
A1 = inv(A) * b;
but as you may notice, the Matlab code analyzer will point out that:
For solving a system of linear equations, the inverse of a
matrix is primarily of theoretical value. Never use the inverse of a
matrix to solve a linear system Ax=b with x=inv(A)*b, because it is
slow and inaccurate.
Replace inv(A)*b with A\b
Replace b*inv(A) with b/A
and that:
INV(A)*b can be slower than A\b
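As a quick sanity check (just a sketch), you can run both forms and confirm the expected solution from the question:
A = [1 1 1 1; 0 1 4 -2; 2 0 -2 1; 1 -2 -1 1];
b = [28;7;22;-4];
x1 = A \ b;       % preferred: backslash (mldivide)
x2 = inv(A)*b;    % same answer here, but slower and less accurate in general
disp([x1 x2])     % both columns should read 10.5, 9, 2.5, 6
disp(norm(A*x1 - b))  % residual; should be ~0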

Determine lag between two vectors

I want to find the minimum lag between two vectors, by which I mean the minimum shift at which the content of one vector is repeated in the other.
for example for
x=[0 0 1 2 2 2 0 0 0 0]
y=[1 2 2 2 0 0 1 2 2 2]
I want to obtain 4 for x to y and obtain 2 for y to x.
I found the finddelay(x,y) function, which works as I want only for x to y (it gives -4 for y to x).
Is there any function that only gives me the lag in one direction (shifting to the right) along the vector? I would be very thankful for any help in getting this result.
I think this may be a potential bug in finddelay. Note this excerpt from the documentation (emphasis mine):
X and Y need not be exact delayed copies of each other, as finddelay(X,Y) returns an estimate of the delay via cross-correlation. However this estimated delay has a useful meaning only if there is sufficient correlation between delayed versions of X and Y. Also, if several delays are possible, as in the case of periodic signals, the delay with the smallest absolute value is returned. In the case that both a positive and a negative delay with the same absolute value are possible, the positive delay is returned.
This would seem to imply that finddelay(y, x) should return 2, when it actually returns -4.
EDIT:
This would appear to be an issue related to floating-point errors introduced by xcorr as I describe in my answer to this related question. If you type type finddelay into the Command Window, you can see that finddelay uses xcorr internally. Even when the inputs to xcorr are integer values, the results (which you would expect to be integer values as well) can end up having floating-point errors that cause them to be slightly larger or smaller than an integer value. This can then change the indices where maxima would be located. The solution is to round the output from xcorr when you know your inputs are all integer values.
A better implementation of finddelay for integer values might be something like this, which would actually return the delay with the smallest absolute value:
function delay = finddelay_int(x, y)
[d, lags] = xcorr(x, y);
d = round(d);
lags = -lags(d == max(d));
[~, index] = min(abs(lags));
delay = lags(index);
end
However, in your question you are asking for the positive delays to be returned, which won't necessarily be the smallest in absolute value. Here's a different implementation of finddelay that works correctly for integer values and gives preference to positive delays:
function delay = finddelay_pos(x, y)
[d, lags] = xcorr(x, y);
d = round(d);
lags = -lags(d == max(d));
index = (lags <= 0);
if all(index)
delay = lags(1);
else
delay = lags(find(index, 1)-1);
end
end
And here are the various results for your test case:
>> x = [0 0 1 2 2 2 0 0 0 0];
>> y = [1 2 2 2 0 0 1 2 2 2];
>> [finddelay(x, y) finddelay(y, x)] % The default behavior, which fails to find
% the delays with smallest absolute value
ans =
4 -4
>> [finddelay_int(x, y) finddelay_int(y, x)] % Correctly finds the delays with the
% smallest absolute value
ans =
-2 2
>> [finddelay_pos(x, y) finddelay_pos(y, x)] % Finds the smallest positive delays
ans =
4 2

Recursion in Matlab: why is my recursion not working past the first step?

So I am very new to Matlab and I have been tasked with implementing LU factorization. I have to do it recursively because we are not allowed to use for loops in our code, and recursion will give us optimal marks. The code works for the first step and does what it is supposed to, but for the next two steps the matrix is not modified at all. Here is my code:
function [L, U] = myLU(A, B, pos)
%A = Matrix that becomes U
%B = Matrix that becomes L
tmp_L = B;
[x,y] = size(A);
if pos > x
    L = B;
    U = A;
    return
else
    pos %<-- just to see if it iterates through everything
    [tmp_U,tmp_L] = elimMat(A,pos);
    myLU(tmp_U,tmp_L, pos+1);
end
L = tmp_L;
U = tmp_U;
end
where elimMat(A, pos) returns the elimination matrix for column pos, as well as another matrix which will end up being the matrix of multipliers. What I tried doing is then finding the LU factorization of this matrix A. Since elimMat returns L and U (this works; if I do it manually it works), I had to make a function that allows me to do it automatically without using a for loop. I thought I would do it recursively. What I ended up doing is adding another variable B to the function so that I can store intermediate values of my matrix obtained in each step and put it all together later.
So here is my question: am I implementing the recursion wrong, and if I am, how can I fix it? The other thing I wanted to ask is how I can implement this so I do not need the variable B as an additional input, and only use the existing variables, or variables previously defined, to find the solution. I would really like only two inputs in my function: the matrix name and the starting index.
Here is elimMat if it helps:
function [M,L] = elimMat(A,k)
%find the size of the matrix
[x,y] = size(A);
tmp_mat = zeros(x,y);
%M = The current matrix we are working on for elimination -> going to
%become U.
%L = The L part of the matrix we are working on. Contains all the
%multipliers. This is going to be our L matrix.
for i = 1:x
    mult = A(i,k)/A(k,k);
    if i > k
        tmp_mat(i,k) = mult;
        P = A(k,:)*mult;
        A(i,:) = A(i,:)-P;
    elseif i == k
        tmp_mat(k,k) = 1;
    end
end
M = A;
L = tmp_mat;
end
Thanks for any feedback you can provide.
Here is the output, what I get vs. what it should be:
U (mine)           U (expected)
1  2  2            1  2  2
0 -4 -6            0 -4 -6
0 -2 -4            0  0  2
L (mine)           L (expected)
1  0  0            1  0    0
4  0  0            4  1    0
4  0  0            4  0.5  1
As you can see, only the first column is changed.
You forgot to catch the output of your recursive call:
[tmp_L, tmp_U] = myLU(tmp_U,tmp_L, pos+1);
Matlab passes variables by value, so a function cannot change its input variable itself (ok, it can, but it's tricky and unsafe).
Your original version didn't return the updated matrices, so the outermost function call encountered the myLU() call, let the recursion unfold and finish, and then went on to use tmp_L and tmp_U as returned from the very first call to elimMat(A,1).
Note that you might want to standardize your functions such that they return U and L in the same order to avoid confusion.
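A minimal sketch of the pass-by-value point, using a hypothetical helper function:
function demo_pass_by_value
% Sketch: MATLAB copies arguments into functions, so changes made inside
% a function are invisible to the caller unless returned and captured.
a = 1;
increment(a);      % output discarded; the caller's a is unchanged
disp(a)            % prints 1
a = increment(a);  % capture the output to see the change
disp(a)            % prints 2
end

function x = increment(x)
x = x + 1;  % modifies only the local copy
end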

What is the Haskell / hmatrix equivalent of the MATLAB pos function?

I'm translating some MATLAB code to Haskell using the hmatrix library. It's going well, but
I'm stumbling on the pos function, because I don't know what it does or what its Haskell equivalent will be.
The MATLAB code looks like this:
[U,S,V] = svd(Y,0);
diagS = diag(S);
...
A = U * diag(pos(diagS-tau)) * V';
E = sign(Y) .* pos( abs(Y) - lambda*tau );
M = D - A - E;
My Haskell translation so far:
(u,s,v) = svd y
diagS = diag s
a = u `multiply` (diagS - tau) `multiply` v
This actually type checks ok, but of course, I'm missing the "pos" call, and it throws the error:
inconsistent dimensions in matrix product (3,3) x (4,4)
So I'm guessing pos does something with matrix size? Googling "matlab pos function" didn't turn up anything useful, so any pointers are very much appreciated! (Obviously I don't know much MATLAB)
Incidentally this is for the TILT algorithm to recover low rank textures from a noisy, warped image. I'm very excited about it, even if the math is way beyond me!
Looks like the pos function is defined in a different MATLAB file:
function P = pos(A)
P = A .* double( A > 0 );
I can't quite decipher what this is doing. Assuming that boolean values are cast to doubles, where "True" == 1.0 and "False" == 0.0, does it turn negative values to zero and leave positive numbers unchanged?
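A quick check in MATLAB (just a sketch) supports that reading:
% Sketch: A > 0 yields a logical mask; casting to double and multiplying
% elementwise zeroes out the non-positive entries.
A = [-1 2; 3 -4];
P = A .* double(A > 0)  % displays [0 2; 3 0]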
It looks as though pos finds the positive part of a matrix. You could implement this directly with mapMatrix
pos :: (Storable a, Num a, Ord a) => Matrix a -> Matrix a
pos = mapMatrix go where
go x | x > 0 = x
| otherwise = 0
Note, though, that Matlab makes no distinction between Matrix and Vector, unlike Haskell.
But it's worth analyzing that Matlab fragment more. Per http://www.mathworks.com/help/matlab/ref/svd.html the first line computes the "economy-sized" Singular Value Decomposition of Y, i.e. three matrices such that
U * S * V' = Y
where, assuming Y is m x n, then U is m x n, S is n x n and diagonal, and V is n x n. Further, both U and V should be orthonormal. In linear algebraic terms this separates the linear transformation Y into two "rotation" components and the central singular-value scaling component.
Since S is diagonal, we extract that diagonal as a vector using diag(S) and then subtract a term tau, which must also be a vector. This might produce a diagonal containing negative values, which cannot be properly interpreted as singular values, so pos is there to trim out the negative values, setting them to 0. We then use diag to convert the resulting vector back into a diagonal matrix and multiply the pieces back together to get A, a modified form of Y.
Note that we can skip some steps in Haskell, as svd (and its "economy-sized" partner thinSVD) return vectors of singular values instead of mostly-zero diagonal matrices.
(u, s, v) = thinSVD y
-- note the trans here, that was the ' in Matlab
a = u `multiply` diag (fmap (max 0) s) `multiply` trans v
Above, fmap maps max 0 over the Vector of singular values s, and then diag (from Numeric.Container) reinflates the Vector into a Matrix prior to the multiply calls. With a little thought it's easy to see that max 0 is just pos applied to a single element.
(A > 0) returns a matrix of the same size as A, with ones at the positions of elements of A which are larger than zero and zeros elsewhere.
For example, if you have
A = [ -1 2 -3 4
5 6 -7 -8 ]
then B = (A > 0) returns
B = [ 0 1 0 1
1 1 0 0]
Note that we have ones corresponding to the elements of A which are larger than zero, and 0 otherwise.
Now if you multiply this elementwise with A using the .* notation, then you are multiplying each element of A that is larger than zero by 1, and by zero otherwise. That is, A .* B means
[ -1*0 2*1 -3*0 4*1
5*1 6*1 -7*0 -8*0 ]
giving finally,
[ 0 2 0 4
5 6 0 0 ]
So you need to write your own function that returns positive values intact and sets negative values to zero.
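In MATLAB itself that function can be a one-liner; this is just a sketch, relying on the fact that max against a scalar works elementwise:
% Sketch: elementwise "positive part" via max against the scalar 0.
pos = @(A) max(A, 0);
pos([-1 2 -3 4; 5 6 -7 -8])  % displays [0 2 0 4; 5 6 0 0]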
Also, for a general SVD, u and v do not match in dimension, so you would actually need to re-diagonalize pos(diagS - tau), i.e. rebuild a diagonal matrix of the appropriate size from it, so that u * diag(pos(diagS - tau)) * v' is dimensionally consistent.