Eigen Library - EigenSolver gets different eigenvalues compared with MATLAB's eig function [duplicate] - matlab

Does anyone have any idea how I can reimplement MATLAB's eig(A,B), which computes generalized eigenvalues/eigenvectors? I've been struggling with this problem lately.
The MATLAB definition of the eig function I need:
[V,D] = eig(A,B) produces a diagonal matrix D of generalized
eigenvalues and a full matrix V whose columns are the corresponding
eigenvectors so that A*V = B*V*D.
So far I tried the Eigen library (http://eigen.tuxfamily.org/dox/classEigen_1_1GeneralizedSelfAdjointEigenSolver.html)
My implementation looks like this:
std::pair<Matrix4cd, Vector4d> eig(const Matrix4cd& A, const Matrix4cd& B)
{
    Eigen::GeneralizedSelfAdjointEigenSolver<Matrix4cd> solver(A, B);
    Matrix4cd V = solver.eigenvectors();
    Vector4d D = solver.eigenvalues();
    return std::make_pair(V, D);
}
But the first thing that comes to mind is that I can't use Vector4cd, since .eigenvalues() doesn't return complex values where MATLAB does. Furthermore, the results of .eigenvectors() and .eigenvalues() for the same matrices are not the same at all:
C++:
Matrix4cd x;
Matrix4cd y;
pair<Matrix4cd, Vector4d> result;
for (int i = 0; i < 4; i++)
{
    for (int j = 0; j < 4; j++)
    {
        x.real()(i,j) = (double)(i+j+1+i*3);
        y.real()(i,j) = (double)(17 - (i+j+1+i*3));
        x.imag()(i,j) = (double)(i+j+1+i*3);
        y.imag()(i,j) = (double)(17 - (i+j+1+i*3));
    }
}
result = eig(x,y);
cout << result.first << endl << endl;
cout << result.second << endl << endl;
Matlab:
for i = 1:1:4
    for j = 1:1:4
        x(i,j) = complex((i-1)+(j-1)+1+((i-1)*3), (i-1)+(j-1)+1+((i-1)*3));
        y(i,j) = complex(17 - ((i-1)+(j-1)+1+((i-1)*3)), 17 - ((i-1)+(j-1)+1+((i-1)*3)));
    end
end
[A,B] = eig(x,y)
So I give eig the same 4x4 matrices, holding the values 1-16 ascending (x) and descending (y). But I receive different results; furthermore, Eigen's eigenvalues() returns double while MATLAB returns complex double. I also found that there is another Eigen solver named GeneralizedEigenSolver. Its documentation (http://eigen.tuxfamily.org/dox/classEigen_1_1GeneralizedEigenSolver.html) says that it solves A*V = B*V*D, but to be honest I tried it and the result matrices are not the same size as MATLAB's, so I got quite lost about how it works (example results are on the website I've linked). It also has only an .eigenvectors() method.
C++ results:
(-0.222268,-0.0108754) (0.0803437,-0.0254809) (0.0383264,-0.0233819) (0.0995482,0.00682079)
(-0.009275,-0.0182668) (-0.0395551,-0.0582127) (0.0550395,0.03434) (-0.034419,-0.0287563)
(-0.112716,-0.0621061) (-0.010788,0.10297) (-0.0820552,0.0294896) (-0.114596,-0.146384)
(0.28873,0.257988) (0.0166259,-0.0529934) (0.0351645,-0.0322988) (0.405394,0.424698)
-1.66983
-0.0733194
0.0386832
3.97933
Matlab results:
[A,B] = eig(x,y)
A =
  -0.9100 + 0.0900i  -0.5506 + 0.4494i   0.3614 + 0.3531i  -0.3237 - 0.3868i
   0.7123 + 0.0734i   0.4928 - 0.2586i  -0.5663 - 0.4337i   0.2338 + 0.7662i
   0.0899 - 0.4170i  -0.1210 - 0.3087i   0.0484 - 0.1918i   0.5036 - 0.3720i
   0.1077 + 0.2535i   0.1787 + 0.1179i   0.1565 + 0.2724i  -0.4136 - 0.0074i
B =
  -1.0000 + 0.0000i   0.0000 + 0.0000i   0.0000 + 0.0000i   0.0000 + 0.0000i
   0.0000 + 0.0000i  -1.0000 - 0.0000i   0.0000 + 0.0000i   0.0000 + 0.0000i
   0.0000 + 0.0000i   0.0000 + 0.0000i  -4.5745 - 1.8929i   0.0000 + 0.0000i
   0.0000 + 0.0000i   0.0000 + 0.0000i   0.0000 + 0.0000i  -0.3317 + 1.1948i
My second try was with Intel IPP, but it seems that it solves only A*V = V*D, and support told me that it's not supported anymore.
https://software.intel.com/en-us/node/505270 (list of constructors for Intel IPP)
I got a suggestion to move from Intel IPP to MKL. I did, and hit the wall again. I tried to check all the available eigensolver algorithms, but it seems that only the standard problem A*V = V*D is handled. I was checking lapack95.lib. The list of algorithms used by this library is available here:
https://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mkl_lapack_examples/index.htm#dsyev.htm
Somewhere on the web I found a topic on MathWorks where someone said they managed to partially solve my problem using MKL:
http://jp.mathworks.com/matlabcentral/answers/40050-generalized-eigenvalue-and-eigenvectors-differences-between-matlab-eig-a-b-and-mkl-lapack-dsygv
That person said they used the dsygv algorithm, but I can't locate anything like that on the web. Maybe it's a typo.
Does anyone have any other proposition/idea for how I can implement it? Or maybe point out my mistake? I'd appreciate it.
EDIT:
In the comments I received a hint that I was using the Eigen solver wrong: my A matrix wasn't self-adjoint and my B matrix wasn't positive-definite. I took matrices from the program I want to rewrite in C++ (from a random iteration) and checked whether they meet the requirements. They did:
Rj =
1.0e+02 *
   0.1302 + 0.0000i  -0.0153 + 0.0724i   0.0011 - 0.0042i  -0.0080 + 0.0108i
  -0.0153 - 0.0724i   1.2041 + 0.0000i  -0.0524 + 0.0377i   0.0929 + 0.0115i
   0.0011 + 0.0042i  -0.0524 - 0.0377i   0.0477 + 0.0000i  -0.0055 - 0.0021i
  -0.0080 - 0.0108i   0.0929 - 0.0115i  -0.0055 + 0.0021i   0.0317 + 0.0000i
Rt =
   4.8156 + 0.0000i  -0.3397 + 1.3502i  -0.2143 - 0.3593i   0.5241 - 0.9105i
  -0.3397 - 1.3502i   7.3635 + 0.0000i  -0.5539 - 0.5176i   0.9514 - 0.6572i
  -0.2143 + 0.3593i  -0.5539 + 0.5176i   1.7801 + 0.0000i  -0.7302 - 0.3161i
   0.5241 + 0.9105i   0.9514 + 0.6572i  -0.7302 + 0.3161i   9.6022 + 0.0000i
As for Rj, which is now my A: it is self-adjoint, because Rj = Rj' and Rj = ctranspose(Rj) (http://mathworld.wolfram.com/Self-AdjointMatrix.html).
As for Rt, which is now my B: it is positive-definite, which I checked with the method linked to me (http://www.mathworks.com/matlabcentral/answers/101132-how-do-i-determine-if-a-matrix-is-positive-definite-using-matlab). So
>> [~,p] = chol(Rt)
p =
0
I copied the matrices manually into C++ and performed eig(A,B) again with matrices meeting the requirements:
Matrix4cd x;
Matrix4cd y;
pair<Matrix4cd, Vector4d> result;
// Same matrices as Rj (x) and Rt (y) above, entered with Eigen's comma initializer.
x.real() <<  13.0163601949795,   -1.53172561296005,   0.109594869350436, -0.804231869422614,
             -1.53172561296005,  120.406645675346,   -5.23758765476463,   9.28686785230169,
              0.109594869350436,  -5.23758765476463,   4.76648319080400,  -0.552823839520508,
             -0.804231869422614,   9.28686785230169,  -0.552823839520508,  3.16510496622613;
x.imag() <<  -0.00000000000000,    7.23946944213164,   0.419181335323979,  1.08441894337449,
             -7.23946944213164,   -0.00000000000000,   3.76849276970080,   1.14635625342266,
              0.419181335323979,  -3.76849276970080,  -0.00000000000000,   0.205129702522089,
             -1.08441894337449,   -1.14635625342266,   0.205129702522089, -0.00000000000000;
y.real() <<   4.81562784930907,   -0.339731222392148, -0.214319720979258,  0.524107127885349,
             -0.339731222392148,   7.36354235698375,  -0.553927983436786,  0.951404408649307,
             -0.214319720979258,  -0.553927983436786,  1.78008768533745,  -0.730246631850385,
              0.524107127885349,   0.951404408649307, -0.730246631850385,  9.60215057284395;
y.imag() <<  -0.00000000000000,    1.35016928394966,  -0.359262708214312, -0.910512495060186,
             -1.35016928394966,   -0.00000000000000,  -0.517616473138836, -0.657235460367660,
              0.359262708214312,   0.517616473138836, -0.00000000000000,  -0.316090662865005,
              0.910512495060186,   0.657235460367660,  0.316090662865005, -0.00000000000000;
result = eig(x,y);
cout << result.first << endl << endl;
cout << result.second << endl << endl;
And the C++ results:
(0.0295948,0.00562174) (-0.253532,0.0138373) (-0.395087,-0.0139696) (-0.0918132,-0.0788735)
(-0.00994614,-0.0213973) (-0.0118322,-0.0445976) (0.00993512,0.0127006) (0.0590018,-0.387949)
(0.0139485,-0.00832193) (0.363694,-0.446652) (-0.319168,0.376483) (-0.234447,-0.0859585)
(0.173697,0.268015) (0.0279387,-0.0103741) (0.0273701,0.0937148) (-0.055169,0.0295393)
0.244233
2.24309
3.24152
18.664
Results of MATLAB:
>> [A,B] = eig(Rj,Rt)
A =
   0.0208 - 0.0218i   0.2425 + 0.0753i  -0.1242 + 0.3753i  -0.0437 - 0.1129i
  -0.0234 - 0.0033i  -0.0044 + 0.0459i   0.0150 - 0.0060i   0.2351 - 0.3142i
   0.0006 - 0.0162i  -0.4964 + 0.2921i   0.2719 + 0.4119i  -0.1661 - 0.1864i
   0.3194 + 0.0000i  -0.0298 + 0.0000i   0.0976 + 0.0000i  -0.0626 + 0.0000i
B =
0.2442 0 0 0
0 2.2431 0 0
0 0 3.2415 0
0 0 0 18.6640
The eigenvalues are the same! Nice, but why aren't the eigenvectors similar at all?

There is no problem here with Eigen.
In fact, for the second example run, MATLAB and Eigen produced the very same result. Please remember from basic linear algebra that eigenvectors are determined only up to an arbitrary scaling factor (i.e., if v is an eigenvector, the same holds for alpha*v, where alpha is a nonzero complex scalar).
It is quite common that different linear algebra libraries compute different eigenvectors, but this does not mean that one of the two codes is wrong: it simply means that they chose a different scaling of the eigenvectors.
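A quick way to convince yourself (a sketch; V1 and V2 are placeholder names for the two eigenvector matrices, e.g. Eigen's result and MATLAB's) is to check that corresponding columns are parallel, which by the equality case of Cauchy-Schwarz holds exactly when |<v1,v2>| equals ||v1||*||v2|| for each column pair:

c = abs(sum(conj(V1).*V2, 1));                          % |<v1,v2>| per column
n = sqrt(sum(abs(V1).^2, 1)) .* sqrt(sum(abs(V2).^2, 1));
all(c > (1 - 1e-10)*n)                                  % true -> columns parallel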
EDIT
The main problem with exactly replicating the scaling chosen by MATLAB is that eig(A,B) is a driver routine which, depending on the properties of A and B, may call different libraries/routines and apply extra steps such as balancing the matrices. By quickly inspecting your example, I would say that in this case MATLAB is enforcing the following condition:
all(imag(V(end,:))==0) (the last component of each eigenvector is real)
but not imposing other constraints. This unfortunately means that the scaling is not unique, and it probably depends on intermediate results of the generalized eigenvector algorithm used. In this case I'm not able to give you advice on how to replicate MATLAB exactly: that would require knowledge of MATLAB's internal workings.
As a general remark, in linear algebra one usually does not care much about eigenvector scaling, since it is completely irrelevant for most problems in which the eigenvectors are just intermediate results.
The only case in which the scaling has to be defined exactly is when you are going to give a graphical representation of the eigenvectors.
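If you only need your C++ results to resemble MATLAB's, you can at least impose the observed condition yourself. A minimal MATLAB-style sketch (V is a placeholder for an eigenvector matrix; the loop ports directly to Eigen):

for k = 1:size(V,2)
    V(:,k) = V(:,k) * exp(-1i * angle(V(end,k)));   % rotate column so its last entry is real
end

Even after this, a real scale factor per column remains free, so an exact match with MATLAB's output is still not guaranteed.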

The eigenvector scaling in MATLAB seems to be based on normalizing them to 1.0 (i.e. the absolute value of the biggest term in each vector is 1.0). In the application I was using, it also returns the left eigenvector rather than the more commonly used right eigenvector. This could explain the differences between MATLAB and the eigensolvers in LAPACK/MKL.
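For reference, a sketch of the normalization this answer describes (an empirical observation, not a documented MATLAB guarantee; V is again a placeholder):

Vn = zeros(size(V));
[~, idx] = max(abs(V), [], 1);           % row of the largest-magnitude entry per column
for k = 1:size(V,2)
    Vn(:,k) = V(:,k) / V(idx(k), k);     % that entry becomes exactly 1.0
end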

Related

Symmetry of the output of fft2 command in matlab

When I run the following in MATLAB
fft2([9.36,-3.42,-2.12;-3.42 0.8 3.68 ;-0.6 -3.88 -4.96 ; 2.82 0.52 2.82])
I obtain
1.6000 + 0.0000i 11.4400 + 4.6765i 11.4400 - 4.6765i
13.2600 + 5.1000i 8.8123 + 8.8711i 7.8077 + 4.7489i
-12.8400 + 0.0000i 20.4600 - 4.2955i 20.4600 + 4.2955i
13.2600 - 5.1000i 7.8077 - 4.7489i 8.8123 - 8.8711i
What is the symmetry of this last matrix? I know that in the FFT there is some symmetry that is exploited to avoid storing all of the matrix, and I need to know what that symmetry is for the output matrices of the fft2 command.
Some context (I don't know if it is useful for the question/answer):
I need to understand this because, from the output of the Fast Fourier Transform computed with the FFTW3 library, I obtain the vector
1.6
13.26 + 5.1i
-12.84
11.44 + 4.6765i
8.8123 + 8.871i
20.46 - 4.2955i
11.44 - 4.6765i
7.8077 + 4.7488i
20.46 + 4.2955i
***
***
***
As you can see, these are almost all the elements of the matrix above, but I do not have the last row, and I think those values could be calculated using some symmetry.
Thanks in advance.
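For what it's worth, the symmetry involved appears to be the conjugate (Hermitian) symmetry of the DFT of real data: Y(i,j) = conj(Y(mi(i),mj(j))), where mi and mj mirror the indices modulo the matrix size. A minimal MATLAB sketch checking this on the example above:

A = [9.36 -3.42 -2.12; -3.42 0.8 3.68; -0.6 -3.88 -4.96; 2.82 0.52 2.82];
Y = fft2(A);
[M, N] = size(Y);
mi = mod(M - (0:M-1), M) + 1;       % mirrored row indices (1-based)
mj = mod(N - (0:N-1), N) + 1;       % mirrored column indices
max(max(abs(Y - conj(Y(mi,mj)))))   % ~0 up to round-off for real input A

Under this relation the missing last row is the conjugate of an existing row with its columns mirrored, which is presumably why the real-to-complex transform does not store it.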

Calculate roots of characteristic equation of symbolic matrix

I have 3 data matrices A, B, C (all 3x3), with which I use the following code to calculate the roots p of D(p)X = 0:
syms p
D = A + B*p + C*(p^2)
solP = double(solve(det(D)))
From this, I get 6 values for solP. But when I try substituting them back into the symbolic matrix D, as follows, I sometimes get non-zero values of det(D):
for i = 1:6
    p = solP(i);
    det(double(subs(D)))   % should always be zero, as we are substituting roots
end
Please help me understand this behaviour.
EDIT ::
Example :
A =
1.0e+11 *
4.8976 7.0936 6.7970
4.4559 7.5469 6.5510
6.4631 2.7603 1.6261
B =
1.0e+11 *
3.9223 7.0605 0.4617
6.5548 0.3183 0.9713
1.7119 2.7692 8.2346
C =
1.0e+11 *
6.9483 0.3445 7.6552
3.1710 4.3874 7.9520
9.5022 3.8156 1.8687
solP =
0.1061 + 0.0000i
1.5311 + 0.0000i
-0.3432 + 0.9356i
-0.3432 - 0.9356i
0.4228 - 0.5465i
0.4228 + 0.5465i
det(D) =
2.2143e+19
-5.4911e+20
-8.6415e+19 + 4.5024e+19i
-8.6415e+19 - 4.5024e+19i
-1.4547e+19 + 9.1135e+19i
-1.4547e+19 - 9.1135e+19i
The problem is related to the relative accuracy of floating point values, which is typically about 1e-16.
The input matrices are of the order 1e+11 - 1e+12 and the solutions are of the order 1e+0, so the elements of D are also of the order 1e+11 - 1e+12. To calculate the determinant of a 3x3 matrix, one takes products of three matrix elements and adds/subtracts them, so each term is of the order 1e+33 - 1e+36. When you subtract such values to obtain the determinant, the expected absolute accuracy is of the order 1e+17 - 1e+20. Indeed, this corresponds to the values you get: given the relative accuracy, you are not able to get any closer to zero.
Note that if you scale your input matrices, i.e. divide them by 1e+11, the solutions are the same, but the determinants are probably more what you would expect.
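A minimal sketch of that scaling, reusing the question's A, B, C (dividing D by a scalar s leaves the roots unchanged, since det(D/s) = det(D)/s^3 for a 3x3 matrix):

scale = 1e11;
syms p
D = A/scale + (B/scale)*p + (C/scale)*(p^2);
solP = double(solve(det(D)));
for i = 1:numel(solP)
    abs(det(double(subs(D, p, solP(i)))))   % now near machine precision, not ~1e+19
end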

Vertcat error using ODE45

I am trying to numerically simulate a system using ode45. I cannot seem to figure out why I'm getting the following error:
Error using vertcat
Dimensions of matrices being concatenated are not consistent.
[t,x] = ode45(@NL_hw3c, [0,20], [1, 2, 3]);
function sys = NL_hw3c(t,x)
sys = [x(2)+ x(1)^2;
x(3)+log10(x(2)^2 + 1);
-3*x(1)-5*x(2)-3*x(3)+4*cos(t)-5*x(1)^2 -3*log10(x(2)^2 + 1) -6*x(1)*x(2) -6*x(1)^3 -(2*x(2)*(x(3)+ log10(x(2)^2 + 1)))/((x(2)^2 + 1)*log(10)) -2*x(1)*x(3) -2*x(1)*log10(x(2)^2 + 1) -2*x(2)^2 -8*x(1)^2*x(2) -6*x(1)^4];
end
Googled and couldn't find a similar solution. Any help would be appreciated.
Thanks
I had to separate each of the variables in your array for it to work:
function s = NL_hw3c(t,x)
    s1 = x(2) + x(1)^2;
    s2 = x(3) + log10(x(2)^2 + 1);
    s3 = -3*x(1) - 5*x(2) - 3*x(3) + 4*cos(t) - 5*x(1)^2 - 3*log10(x(2)^2 + 1) - 6*x(1)*x(2) - 6*x(1)^3 - (2*x(2)*(x(3) + log10(x(2)^2 + 1)))/((x(2)^2 + 1)*log(10)) - 2*x(1)*x(3) - 2*x(1)*log10(x(2)^2 + 1) - 2*x(2)^2 - 8*x(1)^2*x(2) - 6*x(1)^4;
    s = [s1; s2; s3];
end
I got the following output:
t =
0
0.0018
0.0037
0.0055
0.0074
0.0166
...
...
19.7647
19.8431
19.9216
20.0000
x =
1.0000 2.0000 3.0000
1.0055 2.0067 2.8493
1.0111 2.0131 2.6987
1.0167 2.0192 2.5481
1.0224 2.0251 2.3975
...
...
0.7926 -0.0187 -1.7587
0.8380 -0.1567 -1.7624
0.8781 -0.2928 -1.7534
0.9129 -0.4253 -1.7299
The reason your function didn't work is that, in the last entry of your array, the spaces between the parts of the expression are interpreted as column separators. Essentially, the first two rows of your matrix consist of one element each, while the last expression, used exactly as written, would try to place nine elements in the last row. I'm assuming you want a 3 x 1 matrix; the last row would thus violate the size of the matrix you want to create, and this is why it gives you an error.
I'm assuming you want the last value to be a single expression, so you need to compute it as a separate expression and then place it into your array.
To make the code more readable, I've placed all of the entries in separate variables before building the array.
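A minimal illustration of the pitfall (inside square brackets, a space before a unary minus starts a new element, while a minus surrounded by spaces remains a binary operator):

a = [1 -2];    % two elements: 1 and -2  -> size 1x2
b = [1 - 2];   % one element:  1-2 = -1  -> size 1x1
c = [1-2];     % one element:  -1        -> size 1x1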

Results not right when using SeDuMi in YALMIP

I want to use YALMIP in MATLAB to solve an SDP problem:
min X11+X13
s.t. X22=1
X is positive semidefinite
The code is as follows:
P = sdpvar(3,3);
cons = [P >= 0,P(2,2)==1];
options = sdpsettings('solver','sedumi');
obj = [P(1,1)+P(1,3)];
solvesdp(cons,obj,options);
PP = double(P)
PP(1,1)+PP(2,3)
The results are shown below:
PP =
1.2900 0.0000 -2.2900
0.0000 0.0000 0.0000
-2.2900 0.0000 5.8700
ans =
1.2900
I am quite curious about these results: I had the constraint P(2,2)==1, yet in the final result PP(2,2) is not 1. Why does this happen? Can anyone help?
YALMIP assumes a symmetric decision matrix; P = sdpvar(3,3,'full') will do better.
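For context, a short sketch of the distinction this answer points at (standard YALMIP behaviour):

Psym  = sdpvar(3,3);          % symmetric matrix variable: 6 free entries
Pfull = sdpvar(3,3,'full');   % full matrix variable: 9 free entries
% For a symmetric variable, P >= 0 is interpreted as a semidefinite
% constraint; for a full (non-symmetric) variable it is read
% element-wise, so how P is declared changes what the constraint means.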

Avoiding numerical overflow when calculating the value AND gradient of the Logistic loss function

I am currently trying to implement a machine learning algorithm that involves the logistic loss function in MATLAB. Unfortunately, I am having some trouble due to numerical overflow.
In general, for a given input s, the value of the logistic loss function is:
log(1 + exp(s))
and the slope of the logistic loss function is:
exp(s)./(1 + exp(s)) = 1./(1 + exp(-s))
In my algorithm, the value of s = X*beta. Here X is a matrix with N data points and P features per data point (i.e. size(X)=[N,P]) and beta is a vector of P coefficients for each feature such that size(beta)=[P 1].
I am specifically interested in calculating the average value and gradient of the logistic loss for a given value of beta.
The average value of the logistic loss w.r.t. beta is:
L = 1/N * sum(log(1+exp(X*beta)),1)
The average slope of the logistic loss w.r.t. beta is:
dL = 1/N * sum((exp(X*beta)./(1+exp(X*beta))' X, 1)'
Note that size(dL) = [P 1].
My issue is that these expressions keep producing numerical overflows. The problem effectively comes from the fact that, in double precision, exp(s)=Inf once s exceeds roughly 710, and exp(s)=0 once s drops below roughly -745.
I am looking for a solution such that s can take on any value in floating point arithmetic. Ideally, I would also really appreciate a solution that allows me to evaluate the value and gradient in a vectorized / efficient way.
How about the following approximations:
– For computing L, if s is large, then exp(s) will be much larger than 1:
1 + exp(s) ≅ exp(s)
and consequently
log(1 + exp(s)) ≅ log(exp(s)) = s.
If s is small, then using the Taylor series of exp()
exp(s) ≅ 1 + s
and using the Taylor series of log()
log(1 + exp(s)) ≅ log(2 + s) ≅ log(2) + s / 2.
– For computing dL, for large s
exp(s) ./ (1 + exp(s)) ≅ 1
and for small s
exp(s) ./ (1 + exp(s)) ≅ 1/2 + s / 4.
– The code to compute L could look for example like this:
s = X*beta;
l = log(1+exp(s));
ind = isinf(l);
l(ind) = s(ind);
ind = (l == 0);
l(ind) = log(2) + s(ind) / 2;
L = 1/N * sum(l,1)
I found a good article about this problem.
Cutting through a lot of words, we can simplify the argument to stating that the original expression
log(1 + exp(s))
can be rewritten as
log(exp(s)*(exp(-s) + 1))
= log(exp(s)) + log(exp(-s) + 1)
= s + log(exp(-s) + 1)
This stops overflow from occurring - it doesn't prevent underflow, but by the time that occurs, you have your answer (namely, s). You can't just use this form instead of the original everywhere, since for negative s it is exp(-s) that will overflow. However, we now have the basis for a function that can be written to be accurate and not produce over/underflow:
function LL = logistic(s)
    if s <= 0            % "<=" so the recursion below terminates at s == 0
        LL = log(1 + exp(s));
    else
        LL = s + logistic(-s);
    end
end
I think this maintains reasonably good accuracy.
EDIT now to the meat of your question - making this vectorized, and allowing the calculation of the slope as well. Let's take these one at a time:
function LL = logisticVec(s)
    LL = zeros(size(s));
    LL(s < 0)  = log(1 + exp(s(s < 0)));
    LL(s >= 0) = s(s >= 0) + log(1 + exp(-s(s >= 0)));
end
To obtain the average you wanted:
L = sum(logisticVec(X*beta)) / N;
The slope is a little bit trickier; note I believe you may have a typo in your expression (missing a multiplication sign).
dL = X' * (exp(X*beta) ./ (1 + exp(X*beta))) / N;
If we divide top and bottom by exp(X*beta) we get
dL = X' * (1 ./ (exp(-X*beta) + 1)) / N;
Once again, the overflow has gone away and we are left with underflow - but since the underflowed value has 1 added to it, the error this creates is insignificant.
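Putting the pieces together (a sketch using the question's X, beta, N and the logisticVec above):

s  = X*beta;
L  = sum(logisticVec(s)) / N;          % numerically stable average loss
dL = X' * (1 ./ (1 + exp(-s))) / N;    % numerically stable gradient, size [P 1]

For very negative s, exp(-s) overflows to Inf, but 1./Inf cleanly evaluates to 0, so no NaNs are produced.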