Related
Do anyone have any idea how can I rewrite eig(A,B) from Matlab used to calculate generalized eigenvector/eigenvalues? I've been struggling with this problem lately. So far:
Matlab definition of eig function I need:
[V,D] = eig(A,B) produces a diagonal matrix D of generalized
eigenvalues and a full matrix V whose columns are the corresponding
eigenvectors so that A*V = B*V*D.
So far I tried the Eigen library (http://eigen.tuxfamily.org/dox/classEigen_1_1GeneralizedSelfAdjointEigenSolver.html)
My implementation looks like this:
std::pair<Matrix4cd, Vector4d> eig(const Matrix4cd& A, const Matrix4cd& B)
{
Eigen::GeneralizedSelfAdjointEigenSolver<Matrix4cd> solver(A, B);
Matrix4cd V = solver.eigenvectors();
Vector4d D = solver.eigenvalues();
return std::make_pair(V, D);
}
But first thing that comes to my mind is, that I can't use Vector4cd as .eigenvalues() doesn't return complex values where Matlab does. Furthermore results of .eigenvectors() and .eigenvalues() for the same matrices are not the same at all:
C++:
Matrix4cd x;
Matrix4cd y;
pair<Matrix4cd, Vector4d> result;
for (int i = 0; i < 4; i++)
{
for (int j = 0; j < 4; j++)
{
x.real()(i,j) = (double)(i+j+1+i*3);
y.real()(i,j) = (double)(17 - (i+j+1+i*3));
x.imag()(i,j) = (double)(i+j+1+i*3);
y.imag()(i,j) = (double)(17 - (i+j+1+i*3));
}
}
result = eig(x,y);
cout << result.first << endl << endl;
cout << result.second << endl << endl;
Matlab:
for i=1:1:4
for j=1:1:4
x(i,j) = complex((i-1)+(j-1)+1+((i-1)*3), (i-1)+(j-1)+1+((i-1)*3));
y(i,j) = complex(17 - ((i-1)+(j-1)+1+((i-1)*3)), 17 - ((i-1)+(j-1)+1+((i-1)*3)));
end
end
[A,B] = eig(x,y)
So I give eig the same 4x4 matrices holding values 1-16 ascending (x) and descending (y). But I receive different results, furthermore Eigen method returns double from eigenvalues while Matlab returns complex dobule. I also find out that there is other Eigen solver named GeneralizedEigenSolver. That one in the documentation (http://eigen.tuxfamily.org/dox/classEigen_1_1GeneralizedEigenSolver.html) has written that it solves A*V = B*V*D but to be honest I tried it and results (matrix sizes) are not the same size as Matlab so I got quite lost how it works (examplary results are on the website I've linked). It also has only .eigenvector method.
C++ results:
(-0.222268,-0.0108754) (0.0803437,-0.0254809) (0.0383264,-0.0233819) (0.0995482,0.00682079)
(-0.009275,-0.0182668) (-0.0395551,-0.0582127) (0.0550395,0.03434) (-0.034419,-0.0287563)
(-0.112716,-0.0621061) (-0.010788,0.10297) (-0.0820552,0.0294896) (-0.114596,-0.146384)
(0.28873,0.257988) (0.0166259,-0.0529934) (0.0351645,-0.0322988) (0.405394,0.424698)
-1.66983
-0.0733194
0.0386832
3.97933
Matlab results:
[A,B] = eig(x,y)
A =
Columns 1 through 3
-0.9100 + 0.0900i -0.5506 + 0.4494i 0.3614 + 0.3531i
0.7123 + 0.0734i 0.4928 - 0.2586i -0.5663 - 0.4337i
0.0899 - 0.4170i -0.1210 - 0.3087i 0.0484 - 0.1918i
0.1077 + 0.2535i 0.1787 + 0.1179i 0.1565 + 0.2724i
Column 4
-0.3237 - 0.3868i
0.2338 + 0.7662i
0.5036 - 0.3720i
-0.4136 - 0.0074i
B =
Columns 1 through 3
-1.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i
0.0000 + 0.0000i -1.0000 - 0.0000i 0.0000 + 0.0000i
0.0000 + 0.0000i 0.0000 + 0.0000i -4.5745 - 1.8929i
0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i
Column 4
0.0000 + 0.0000i
0.0000 + 0.0000i
0.0000 + 0.0000i
-0.3317 + 1.1948i
Second try was with Intel IPP but it seems that it solves only A*V = V*D and support told me that it's not supported anymore.
https://software.intel.com/en-us/node/505270 (list of constructors for Intel IPP)
I got suggestion to move from Intel IPP to MKL. I did it and hit the wall again. I tried to check all algorithms for Eigen but it seems that there are only A*V = V*D problems solved. I was checking lapack95.lib. The list of algorithms used by this library is available there:
https://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mkl_lapack_examples/index.htm#dsyev.htm
Somewhere on the web I could find topic on Mathworks when someone said that managed to solve my problem partially with usage of MKL:
http://jp.mathworks.com/matlabcentral/answers/40050-generalized-eigenvalue-and-eigenvectors-differences-between-matlab-eig-a-b-and-mkl-lapack-dsygv
Person said that he/she used dsygv algorithm but I can't locate anything like that on the web. Maybe it's a typo.
Anyone has any other proposition/idea how can I implement it? Or maybe point my mistake. I'd appreciate that.
EDIT:
In comments I've received a hint that I was using Eigen solver wrong. My A matrix wasn't self-adjoint and my B matrix wasn't positive-definite. I took matrices from program I want to rewrite to C++ (from random iteration) and checked if they meet the requirements. They did:
Rj =
1.0e+02 *
Columns 1 through 3
0.1302 + 0.0000i -0.0153 + 0.0724i 0.0011 - 0.0042i
-0.0153 - 0.0724i 1.2041 + 0.0000i -0.0524 + 0.0377i
0.0011 + 0.0042i -0.0524 - 0.0377i 0.0477 + 0.0000i
-0.0080 - 0.0108i 0.0929 - 0.0115i -0.0055 + 0.0021i
Column 4
-0.0080 + 0.0108i
0.0929 + 0.0115i
-0.0055 - 0.0021i
0.0317 + 0.0000i
Rt =
Columns 1 through 3
4.8156 + 0.0000i -0.3397 + 1.3502i -0.2143 - 0.3593i
-0.3397 - 1.3502i 7.3635 + 0.0000i -0.5539 - 0.5176i
-0.2143 + 0.3593i -0.5539 + 0.5176i 1.7801 + 0.0000i
0.5241 + 0.9105i 0.9514 + 0.6572i -0.7302 + 0.3161i
Column 4
0.5241 - 0.9105i
0.9514 - 0.6572i
-0.7302 - 0.3161i
9.6022 + 0.0000i
As for Rj which is now my A - it is self-adjoint because Rj = Rj' and Rj = ctranspose(Rj). (http://mathworld.wolfram.com/Self-AdjointMatrix.html)
As for Rt which is now my B - it is Positive-Definite what is checked with method linked to me. (http://www.mathworks.com/matlabcentral/answers/101132-how-do-i-determine-if-a-matrix-is-positive-definite-using-matlab). So
>> [~,p] = chol(Rt)
p =
0
I've rewritten matrices manually to C++ and performed eig(A,B) again with matrices meeting requirements:
Matrix4cd x;
Matrix4cd y;
pair<Matrix4cd, Vector4d> result;
x.real()(0,0) = 13.0163601949795;
x.real()(0,1) = -1.53172561296005;
x.real()(0,2) = 0.109594869350436;
x.real()(0,3) = -0.804231869422614;
x.real()(1,0) = -1.53172561296005;
x.real()(1,1) = 120.406645675346;
x.real()(1,2) = -5.23758765476463;
x.real()(1,3) = 9.28686785230169;
x.real()(2,0) = 0.109594869350436;
x.real()(2,1) = -5.23758765476463;
x.real()(2,2) = 4.76648319080400;
x.real()(2,3) = -0.552823839520508;
x.real()(3,0) = -0.804231869422614;
x.real()(3,1) = 9.28686785230169;
x.real()(3,2) = -0.552823839520508;
x.real()(3,3) = 3.16510496622613;
x.imag()(0,0) = -0.00000000000000;
x.imag()(0,1) = 7.23946944213164;
x.imag()(0,2) = 0.419181335323979;
x.imag()(0,3) = 1.08441894337449;
x.imag()(1,0) = -7.23946944213164;
x.imag()(1,1) = -0.00000000000000;
x.imag()(1,2) = 3.76849276970080;
x.imag()(1,3) = 1.14635625342266;
x.imag()(2,0) = 0.419181335323979;
x.imag()(2,1) = -3.76849276970080;
x.imag()(2,2) = -0.00000000000000;
x.imag()(2,3) = 0.205129702522089;
x.imag()(3,0) = -1.08441894337449;
x.imag()(3,1) = -1.14635625342266;
x.imag()(3,2) = 0.205129702522089;
x.imag()(3,3) = -0.00000000000000;
y.real()(0,0) = 4.81562784930907;
y.real()(0,1) = -0.339731222392148;
y.real()(0,2) = -0.214319720979258;
y.real()(0,3) = 0.524107127885349;
y.real()(1,0) = -0.339731222392148;
y.real()(1,1) = 7.36354235698375;
y.real()(1,2) = -0.553927983436786;
y.real()(1,3) = 0.951404408649307;
y.real()(2,0) = -0.214319720979258;
y.real()(2,1) = -0.553927983436786;
y.real()(2,2) = 1.78008768533745;
y.real()(2,3) = -0.730246631850385;
y.real()(3,0) = 0.524107127885349;
y.real()(3,1) = 0.951404408649307;
y.real()(3,2) = -0.730246631850385;
y.real()(3,3) = 9.60215057284395;
y.imag()(0,0) = -0.00000000000000;
y.imag()(0,1) = 1.35016928394966;
y.imag()(0,2) = -0.359262708214312;
y.imag()(0,3) = -0.910512495060186;
y.imag()(1,0) = -1.35016928394966;
y.imag()(1,1) = -0.00000000000000;
y.imag()(1,2) = -0.517616473138836;
y.imag()(1,3) = -0.657235460367660;
y.imag()(2,0) = 0.359262708214312;
y.imag()(2,1) = 0.517616473138836;
y.imag()(2,2) = -0.00000000000000;
y.imag()(2,3) = -0.316090662865005;
y.imag()(3,0) = 0.910512495060186;
y.imag()(3,1) = 0.657235460367660;
y.imag()(3,2) = 0.316090662865005;
y.imag()(3,3) = -0.00000000000000;
result = eig(x,y);
cout << result.first << endl << endl;
cout << result.second << endl << endl;
And the results of C++:
(0.0295948,0.00562174) (-0.253532,0.0138373) (-0.395087,-0.0139696) (-0.0918132,-0.0788735)
(-0.00994614,-0.0213973) (-0.0118322,-0.0445976) (0.00993512,0.0127006) (0.0590018,-0.387949)
(0.0139485,-0.00832193) (0.363694,-0.446652) (-0.319168,0.376483) (-0.234447,-0.0859585)
(0.173697,0.268015) (0.0279387,-0.0103741) (0.0273701,0.0937148) (-0.055169,0.0295393)
0.244233
2.24309
3.24152
18.664
Results of MATLAB:
>> [A,B] = eig(Rj,Rt)
A =
Columns 1 through 3
0.0208 - 0.0218i 0.2425 + 0.0753i -0.1242 + 0.3753i
-0.0234 - 0.0033i -0.0044 + 0.0459i 0.0150 - 0.0060i
0.0006 - 0.0162i -0.4964 + 0.2921i 0.2719 + 0.4119i
0.3194 + 0.0000i -0.0298 + 0.0000i 0.0976 + 0.0000i
Column 4
-0.0437 - 0.1129i
0.2351 - 0.3142i
-0.1661 - 0.1864i
-0.0626 + 0.0000i
B =
0.2442 0 0 0
0 2.2431 0 0
0 0 3.2415 0
0 0 0 18.6640
Eigenvalues are the same! Nice, but why Eigenvectors are not similar at all?
There is no problem here with Eigen.
In fact for the second example run, Matlab and Eigen produced the very same result. Please remember from basic linear algebra that eigenvector are determined up to an arbitrary scaling factor. (I.e. if v is an eigenvector the same holds for alpha*v, where alpha is a non zero complex scalar.)
It is quite common that different linear algebra libraries compute different eigenvectors, but this does not mean that one of the two codes is wrong: it simply means that they choose a different scaling of the eigenvectors.
EDIT
The main problem with exactly replicating the scaling chosen by matlab is that eig(A,B) is a driver routine, which depending from the different properties of A and B may call different libraries/routines, and apply extra steps like balancing the matrices and so on. By quickly inspecting your example, I would say that in this case matlab is enforcing following condition:
all(imag(V(end,:))==0) (the last component of each eigenvector is real)
but not imposing other constraints. This unfortunately means that the scaling is not unique, and probably depends on intermediate results of the generalised eigenvector algorithm used. In this case I'm not able to give you advice on how to exactly replicate matlab: knowledge of the internal working of matlab is required.
As a general remark, in linear algebra usually one does not care too much about eigenvector scaling, since this is usually completely irrelevant for the problem solved, when the eigenvectors are just used as intermediate results.
The only case in which the scaling has to be defined exactly, is when you are going to give a graphic representation of the eigenvalues.
The eigenvector scaling in Matlab seems to be based on normalizing them to 1.0 (ie. the absolute value of the biggest term in each vector is 1.0). In the application I was using it also returns the left eigenvector rather than the more commonly used right eigenvector. This could explain the differences between Matlab and the eigensolvers in Lapack MKL.
I have a matrix with 35 columns and I'm trying to reduce the dimension using PCA. I run PCA on my data:
[coeff,score,latent,tsquared,explained,mu] = pca(data);
explained =
99.9955
0.0022
0.0007
0.0003
0.0002
0.0001
0.0001
0.0001
Then, by looking at the vector explained, I notice the value of the first element is 99. Based on this, I decided to take only the first compoenet. So I did the follwoing:
k=1;
X = bsxfun(#minus, data, mean(data)) * coeff(:, 1:k);
and now, I used X for SVM training:
svmStruct = fitcsvm(X,Y,'Standardize',true, 'Prior','uniform','KernelFunction','linear','KernelScale','auto','Verbose',0,'IterationLimit', 1000000);
However, when I tried to predict and calculate the miss-classification rate:
[label,score,cost] = predict(svmStruct, X);
the result was disappointing. I notice, when I select only one component (k=1), I all classification was wrong. However, as I increase number of included components, k, the result improves, as you can see from the diagram below. But this doesn't make sense according to explained, which indicates that I should be fine with only the first eigenvector.
Did I do any mistake?
This diagram shows the classification error as a function of the number of included eginvectors:
This graph is generated after by doing normalization before doing PCA as suggested by #zelanix:
This is also plotted graph:
and this explained values obtained after doing normalization before PCA:
>> [coeff,score,latent,tsquared,explained,mu] = pca(data_normalised);
Warning: Columns of X are linearly dependent to within machine precision.
Using only the first 27 components to compute TSQUARED.
> In pca>localTSquared (line 501)
In pca (line 347)
>> explained
explained =
32.9344
15.6790
5.3093
4.7919
4.0905
3.8655
3.0015
2.7216
2.6300
2.5098
2.4275
2.3078
2.2077
2.1726
2.0892
2.0425
2.0273
1.9135
1.8809
1.7055
0.8856
0.3390
0.2204
0.1061
0.0989
0.0334
0.0085
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
Parag S. Chandakkar is absolutely right that there is no reason to expect that PCA will automatically improve your classification result. It is an unsupervised method so is not intended to improve separability, only to find the components with the largest variance.
But there are some other problems with your code. In particular, this line confuses me:
X = bsxfun(#minus, data, mean(data)) * coeff(:, 1:k);
You need to normalise your data before performing PCA, and each feature needs to be normalised separately. I use the following:
data_normalised = data;
for f = 1:size(data, 2)
data_normalised(:, f) = data_normalised(:, f) - nanmean(data_normalised(:, f));
data_normalised(:, f) = data_normalised(:, f) / nanstd(data_normalised(:, f));
end
pca_coeff = pca(data_normalised);
data_pca = data_normalised * pca_coeff;
You can then extract the first principal component as data_pca(:, 1).
Also, always plot your PCA results to get an idea of what is actually going on:
figure
scatter(data_pca(Y == 1, 1), data_pca(Y == 1, 2))
hold on;
scatter(data_pca(Y == 2, 1), data_pca(Y == 2, 2))
PCA gives the direction of maximum variance in the data, it does not necessarily have to do better classification. If you want to reduce your data while trying to maximize your accuracy, you should do LDA.
The following picture illustrates exactly what I want to convey.
I'm using Matlab to analyse a couple of data, for I that I need the curve fitting, I've wrote this code from the documentation :
% I is 14 points vector that change its value in a loop
y =0:13;
[p,S] = polyfit(I,y,1);
[fx, delta] = polyval(p,I,S);
plot(y,I,'+',fx,I,'-');
here is what I get :
my question is , how can evaluate this 'fitting', I mean how good it is , and how can I get the slope of this line?
UPDATE
after Rafaeli's answer , I had some trouble understand the results, since fx is the fitting curve fitting for y considering 'I' , meaning that I get for `fx':
-1.0454 3.0800 4.3897 6.5324 4.0947 3.8975 4.3476 9.0088 5.8307 6.7166 9.8243 11.4009 11.9223
instead the I values are :
0.0021 0.0018 0.0017 0.0016 0.0018 0.0018 0.0017 0.0014 0.0016 0.0016 0.0014 0.0012 0.0012 0.0013
and the plot has exactly the value of `I' :
so the result I hope to get should be near to those values ! Itried to switch the
[p,S] = polyfit(y,I,1);
but is didn't the wasn't any better fx= 0.0020,so my question is how can I do that ?
2nd UPDATE
got it, here is the code :
y = 0:13
p = polyfit(y,I,1)
fx = polyval(p,y);
plot(y,I,'+',y,fx,'o')
here is the result :
thanks for any help !
The line is defined by y = ax + b, where a = p(1) and b = p(2), so the slope is p(1).
A simple way to know how good is the fit is to take the root mean square of the error: rms(fx - I). The lesser the value, better the fit.
I have a set of measurements and I started making a linear approximation (as in this plot). A linear least squares estimation of the parameters V_{max} and K_{m} from this code in Matlab:
data=[2.0000 0.0615
2.0000 0.0527
0.6670 0.0334
0.6670 0.0334
0.4000 0.0138
0.4000 0.0258
0.2860 0.0129
0.2860 0.0183
0.2220 0.0083
0.2200 0.0169
0.2000 0.0129
0.2000 0.0087 ];
x = 1./data(:,1);
y = 1./data(:,2);
J = [x,ones(length(x),1)];
k = J\y;
vmax = 1/k(2);
km = k(1)*vmax;
lse = (vmax.*data(:,1))./(km+data(:,1));
plot(data(:,1),data(:,2),'o','color','red','linewidth',1)
line(data(:,1),lse,'linewidth',2)
This yields a fit that looks alright. Next, I wanted to do the same thing but with non-linear least squares. However, the fit always looks wrong, here is the code for that attempt:
options = optimset('MaxIter',10000,'MaxFunEvals',50000,'FunValCheck',...
'on','Algorithm',{'levenberg-marquardt',.00001});
p=lsqnonlin(#myfun,[0.1424,2.5444]);
lse = (p(1).*data(:,1))./(p(2)+data(:,1));
plot(data(:,1),data(:,2),'o','color','red','linewidth',1)
line(data(:,1),lse,'linewidth',2)
which requires this function in an M-File:
function F = myfun(x)
F = data(:,2)-(x(1).*data(:,1))./x(2)+data(:,1);
If you run the code you will see my problem. But hopefully, unlike me, you see what I'm doing wrong.
I think that you forgot some parentheses (some others are superfluous) in your nonlinear function. Using an anonymous function:
myfun = #(x)data(:,2)-x(1).*data(:,1)./(x(2)+data(:,1)); % Parentheses were missing
options = optimset('MaxIter',10000,'MaxFunEvals',50000,'FunValCheck','on',...
'Algorithm',{'levenberg-marquardt',.00001});
p = lsqnonlin(myfun,[0.1424,2.5444],[],[],options);
lse = p(1).*data(:,1)./(p(2)+data(:,1));
plot(data(:,1),data(:,2),'o','color','red','linewidth',1)
line(data(:,1),lse,'linewidth',2)
You also weren't actually applying any of your options.
You might look into using lsqcurvefit instead as it was designed for data fitting problems:
myfun = #(x,dat)x(1).*dat./(x(2)+dat);
options = optimset('MaxIter',10000,'MaxFunEvals',50000,'FunValCheck','on',...
'Algorithm',{'levenberg-marquardt',.00001});
p = lsqcurvefit(myfun,[0.1424,2.5444],data(:,1),data(:,2),[],[],options);
lse = myfun(p,data(:,1));
plot(data(:,1),data(:,2),'o','color','red','linewidth',1)
line(data(:,1),lse,'linewidth',2)
I am doing unsupervised classification. For this I have n features for classification and I want to use PCA for projecting data into new subspace and then perform clustering using output of PCA. I have written the following code:
for c=1:size(feature,1)
feature(c,:)=feature(c,:)-mean(feature);
end
DataCov=cov(feature); % covariance matrix
[PC,latent,explained] = pcacov(DataCov);
This gives me :
PC =
0.6706 0.7348 0.0965 0.0316 -0.0003 -0.0001
0.0009 -0.0060 0.0298 0.0378 0.8157 -0.5764
0.0391 -0.1448 0.5661 0.8091 -0.0406 0.0264
0.7403 -0.6543 -0.1461 -0.0505 0.0018 -0.0005
0.0003 -0.0020 0.0193 -0.0116 0.5768 0.8166
0.0264 -0.1047 0.8048 -0.5832 -0.0151 -0.0169
latent =
0.0116
0.0001
0.0000
0.0000
0.0000
0.0000
explained =
98.8872 <-----
1.0445
0.0478
0.0205
0.0000
0.0000
explained shows that only the first component (indicated by <--) really contributes a significant amount to explained variance.
Please reply, Is it possible to create a new features using only first component.???
Following is giving me new feature set, feature_New, using all Principle component. Is this a right way to create new feature set on which I can perform clustering:
feature_New= feature*PC;