Same optimization algorithm resulting in different coefficients and computation time - MATLAB

I noticed that when I run an optimization method in MATLAB with the fmincon function, for example the active-set method, the resulting coefficients and computation time are different every time I run the algorithm. I do not change the starting points; I just run the method one after another.
I understand this is the case for global optimization algorithms, since they use stochastic algorithms in the background. But why does the same thing happen for local minimization methods?
EDIT: adding the code
options = optimset('fmincon') ;
options = optimset(options,...
    'LargeScale','on',...
    'Algorithm','active-set',...
    'MaxFunEvals',10000,...
    'Display','off',...
    'TolCon', 1e-10,...
    'TolFun', 1e-10,...
    'TolX', 1e-10,...
    'MaxIter',10000) ;
[Param,fval,exitflag,output] = fmincon(@Cost_FCN, Param0,[],[],[],[],LB,UB,[],options,Fz,Side,Camber,List,Coeff,Fy) ;
Two consecutive runs return, for example, these coefficients (run 1 vs. run 2):
1.8181 1.8181
0.1737 0.1737
0.0210 0.0210
0.0004 0.0004
0.0000 0.0000
9.5810 9.5811
4.7975 4.7981
1.5981 1.5991
9.9934 9.9934
0.8277 0.8277
0.0359 0.0360
0.2438 0.2437
0.0125 0.0125
0.0051 0.0051
0.0041 0.0041
Bounds and constraints are constant and the starting points are the same every time I run the method.
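For reference, a minimal sketch of the same call with the extra data bound into the objective through an anonymous function instead of trailing arguments, followed by a back-to-back comparison of two runs (it assumes Cost_FCN, Param0, LB, UB and the data variables above are in the workspace):
% Bind the extra data into the objective instead of passing it as
% trailing arguments to fmincon
obj = @(P) Cost_FCN(P, Fz, Side, Camber, List, Coeff, Fy);
% Run the same problem twice with identical options and starting point
P1 = fmincon(obj, Param0, [],[],[],[], LB, UB, [], options);
P2 = fmincon(obj, Param0, [],[],[],[], LB, UB, [], options);
% Size of the run-to-run difference in the coefficients
max(abs(P1 - P2))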


Strange result of spap2

I encounter strange results from spap2 on some data:
The actual data is the blue curve, the red circles are the knots I am using, and the yellow curve is the cubic spline fit.
The code is quite simple, and I cannot figure out what the problem is:
spgood = spap2(knots_zY, 4, ec, Y);
plot(ec, Y);
hold on;
scatter(knots_zY, Y(ismember(ec, knots_zY)));   % data values at the knot locations
fnplt(spgood)
ec is the vector -4.12:0.02:-0.54.
Y is the following vector:
4.1291 4.0732 4.0173 4.2624 4.3826 4.3267 4.2708 4.4367 4.3808 4.1031 4.1721 3.8152 4.1572
4.1013 4.0454 3.5916 3.8367 3.7808 3.8218 3.6690 3.9141 3.7333 3.8023 3.3204 3.5656 3.4305
3.5787 3.3978 3.3419 3.2860 3.4062 3.4753 3.5706 3.2385 3.1826 3.4947 3.5315 3.1746 3.2089
3.2276 3.1940 2.9162 3.0364 3.0263 2.8155 2.7596 2.9555 2.8996 2.9081 2.7322 2.8524 2.6397
2.7662 2.5279 2.5417 2.2005 2.3409 2.5108 2.5202 2.3359 2.3660 2.3100 2.1682 2.1123 2.2140
2.1288 2.1116 1.9856 2.0089 1.8845 1.9148 1.9308 1.7273 1.7642 1.7326 1.6606 1.7378 1.6570
1.5815 1.5701 1.4630 1.5503 1.5181 1.4385 1.3083 1.3168 1.2991 1.2523 1.1390 0.9988 1.0373
0.9913 1.0113 0.9754 0.8912 0.8790 0.7491 0.7557 0.7544 0.7119 0.7031 0.6843 0.6418 0.5938
0.5193 0.5334 0.4312 0.4839 0.4437 0.3992 0.3689 0.3287 0.3348 0.3076 0.2274 0.2174 0.1970
0.2188 0.1760 0.1384 0.1773 0.1342 0.1388 0.1097 0.0830 0.0782 0.0725 0.0863 0.0581 0.0466
0.0398 0.0431 0.0187 0.0187 0.0176 0.0167 0.0231 0.0033 -0.0117 -0.0016 0.0084 -0.0055 -0.0120
-0.0080 -0.0064 -0.0075 -0.0134 -0.0075 0.0012 -0.0077 -0.0024 0.0006 0.0010 0.0043 0.0016 0.0018
0.0042 0.0030 0.0029 0.0029 0.0021 0.0013 -0.0002 -0.0020 -0.0030 -0.0032 -0.0002 -0.0013 0.0035
0.0028 -0.0000 -0.0057 -0.0032 0.0020 0.0597 0.1835 0.5083 1.0275 1.6448 3.0549
The knots are defined with the following 12 values:
-4.1200 -3.9400 -3.5400 -3.3000 -3.1400 -2.6800 -2.3600 -2.0600 -1.5000 -1.1600 -0.7000 -0.5400
I don't expect a perfect fit, but I at least expected the spline to stick to the knots ... yet here the result is completely wrong. I am stuck, unable to see where the problem is with this data sample.
Note: the knots are computed by a separate algorithm and should be used for the interpolation; getting a good fit is not the question here. The question is why the spline fit does not pass through the knots.
I have made several errors.
First, it is a mistake to assume that the resulting spline will pass through the knots, since spap2 computes an approximation (see this answer). The approximation smooths the whole original data, so there is no guarantee it sticks to the knots.
Second, I forgot to extend the end knots to impose boundary conditions. The default boundary condition is to have all derivatives (including the 0th-order one) equal to zero, which produces this shape. The solution is to use augknt to get an actual cubic spline with two continuous derivatives:
spgood = spap2(augknt(knots_zY,4), 4, ec, Y);
The resulting fit is:
which is way better, given the choice of the knot sequence.
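For completeness, a minimal sketch of the corrected script under the variable names from the question (ec, Y, knots_zY):
knots = augknt(knots_zY, 4);                     % repeat end knots for an order-4 (cubic) spline
spgood = spap2(knots, 4, ec, Y);                 % least-squares spline approximation
plot(ec, Y);                                     % original data
hold on;
plot(knots_zY, fnval(spgood, knots_zY), 'ro');   % spline values at the knot locations
fnplt(spgood)                                    % the fitted spline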

Dimensions Reduction in Matlab using PCA

I have a matrix with 35 columns and I'm trying to reduce the dimension using PCA. I run PCA on my data:
[coeff,score,latent,tsquared,explained,mu] = pca(data);
explained =
99.9955
0.0022
0.0007
0.0003
0.0002
0.0001
0.0001
0.0001
Then, by looking at the vector explained, I notice the value of the first element is about 99.99. Based on this, I decided to take only the first component. So I did the following:
k=1;
X = bsxfun(@minus, data, mean(data)) * coeff(:, 1:k);
and now, I used X for SVM training:
svmStruct = fitcsvm(X,Y,'Standardize',true, 'Prior','uniform','KernelFunction','linear','KernelScale','auto','Verbose',0,'IterationLimit', 1000000);
However, when I tried to predict and calculate the misclassification rate:
[label,score,cost] = predict(svmStruct, X);
the result was disappointing. I noticed that when I select only one component (k = 1), all classifications are wrong. However, as I increase the number of included components, k, the result improves, as you can see from the diagram below. But this doesn't make sense according to explained, which indicates that I should be fine with only the first eigenvector.
Did I make a mistake?
This diagram shows the classification error as a function of the number of included eigenvectors:
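For reference, a loop of the following shape produces such a curve (a minimal sketch; measuring the error as resubstitution loss on the training set is an assumption, and data, Y and coeff are as above):
centered = bsxfun(@minus, data, mean(data));
err = zeros(1, size(coeff, 2));
for k = 1:size(coeff, 2)
    Xk = centered * coeff(:, 1:k);                                         % keep the first k components
    mdl = fitcsvm(Xk, Y, 'Standardize', true, 'KernelFunction', 'linear');
    err(k) = resubLoss(mdl);                                               % fraction misclassified on the training set
end
plot(1:size(coeff, 2), err, '-o');
xlabel('number of components k'); ylabel('classification error');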
This graph was generated by normalizing the data before PCA, as suggested by @zelanix:
Here is the corresponding plot:
And these are the explained values obtained after normalizing before PCA:
>> [coeff,score,latent,tsquared,explained,mu] = pca(data_normalised);
Warning: Columns of X are linearly dependent to within machine precision.
Using only the first 27 components to compute TSQUARED.
> In pca>localTSquared (line 501)
In pca (line 347)
>> explained
explained =
32.9344
15.6790
5.3093
4.7919
4.0905
3.8655
3.0015
2.7216
2.6300
2.5098
2.4275
2.3078
2.2077
2.1726
2.0892
2.0425
2.0273
1.9135
1.8809
1.7055
0.8856
0.3390
0.2204
0.1061
0.0989
0.0334
0.0085
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
Parag S. Chandakkar is absolutely right that there is no reason to expect that PCA will automatically improve your classification result. It is an unsupervised method so is not intended to improve separability, only to find the components with the largest variance.
But there are some other problems with your code. In particular, this line confuses me:
X = bsxfun(@minus, data, mean(data)) * coeff(:, 1:k);
You need to normalise your data before performing PCA, and each feature needs to be normalised separately. I use the following:
data_normalised = data;
for f = 1:size(data, 2)
data_normalised(:, f) = data_normalised(:, f) - nanmean(data_normalised(:, f));
data_normalised(:, f) = data_normalised(:, f) / nanstd(data_normalised(:, f));
end
pca_coeff = pca(data_normalised);
data_pca = data_normalised * pca_coeff;
You can then extract the first principal component as data_pca(:, 1).
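As an aside, if the data contains no NaNs, the same normalisation can be written more compactly with zscore, and pca can return the projected scores directly (a minimal sketch):
data_normalised = zscore(data);                  % center and scale each column
[pca_coeff, data_pca] = pca(data_normalised);    % second output holds the PCA scores
first_pc = data_pca(:, 1);                       % first principal component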
Also, always plot your PCA results to get an idea of what is actually going on:
figure
scatter(data_pca(Y == 1, 1), data_pca(Y == 1, 2))
hold on;
scatter(data_pca(Y == 2, 1), data_pca(Y == 2, 2))
PCA gives the directions of maximum variance in the data; it does not necessarily lead to better classification. If you want to reduce the dimensionality while trying to maximize accuracy, you should use LDA.
The following picture illustrates exactly what I want to convey.

How to evaluate curve fitting in MATLAB

I'm using MATLAB to analyse some data for which I need curve fitting. I wrote this code based on the documentation:
% I is a 14-point vector whose values change in a loop
y = 0:13;
[p,S] = polyfit(I,y,1);
[fx, delta] = polyval(p,I,S);
plot(y,I,'+',fx,I,'-');
Here is what I get:
My question is: how can I evaluate this fit, i.e. how good it is, and how can I get the slope of this line?
UPDATE
After Rafaeli's answer I had some trouble understanding the results, since fx is the fitted value of y evaluated at I, meaning that I get for fx:
-1.0454 3.0800 4.3897 6.5324 4.0947 3.8975 4.3476 9.0088 5.8307 6.7166 9.8243 11.4009 11.9223
whereas the I values are:
0.0021 0.0018 0.0017 0.0016 0.0018 0.0018 0.0017 0.0014 0.0016 0.0016 0.0014 0.0012 0.0012 0.0013
and the plot shows exactly the values of I:
so the result I hoped to get should be near those values! I tried to switch the arguments:
[p,S] = polyfit(y,I,1);
but it wasn't any better (fx = 0.0020), so my question is: how can I do that?
2nd UPDATE
Got it, here is the code:
y = 0:13
p = polyfit(y,I,1)
fx = polyval(p,y);
plot(y,I,'+',y,fx,'o')
here is the result :
Thanks for any help!
The line is defined by y = ax + b, where a = p(1) and b = p(2), so the slope is p(1).
A simple way to assess how good the fit is: take the root mean square of the error, rms(fx - I). The smaller the value, the better the fit.
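For example, with the variables from the second update (I, fx and p), both quantities can be computed as follows (a minimal sketch; the R-squared line is an extra check not mentioned above):
slope = p(1);                                        % slope of the fitted line
residuals = I - fx;                                  % fit errors at each data point
rmse = sqrt(mean(residuals.^2));                     % root mean square error, same as rms(fx - I)
r2 = 1 - sum(residuals.^2)/sum((I - mean(I)).^2);    % coefficient of determination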

How to use Matlab for non linear least squares Michaelis–Menten parameters estimation

I have a set of measurements and I started with a linear approximation (as in this plot). Taking reciprocals of the Michaelis–Menten equation v = V_{max}*s/(K_{m} + s) gives 1/v = (K_{m}/V_{max})*(1/s) + 1/V_{max}, so a linear least squares estimate of the parameters V_{max} and K_{m} is obtained with this code in MATLAB:
data=[2.0000 0.0615
2.0000 0.0527
0.6670 0.0334
0.6670 0.0334
0.4000 0.0138
0.4000 0.0258
0.2860 0.0129
0.2860 0.0183
0.2220 0.0083
0.2200 0.0169
0.2000 0.0129
0.2000 0.0087 ];
x = 1./data(:,1);
y = 1./data(:,2);
J = [x,ones(length(x),1)];
k = J\y;
vmax = 1/k(2);
km = k(1)*vmax;
lse = (vmax.*data(:,1))./(km+data(:,1));
plot(data(:,1),data(:,2),'o','color','red','linewidth',1)
line(data(:,1),lse,'linewidth',2)
This yields a fit that looks alright. Next, I wanted to do the same thing with non-linear least squares. However, the fit always looks wrong; here is the code for that attempt:
options = optimset('MaxIter',10000,'MaxFunEvals',50000,'FunValCheck',...
'on','Algorithm',{'levenberg-marquardt',.00001});
p=lsqnonlin(@myfun,[0.1424,2.5444]);
lse = (p(1).*data(:,1))./(p(2)+data(:,1));
plot(data(:,1),data(:,2),'o','color','red','linewidth',1)
line(data(:,1),lse,'linewidth',2)
which requires this function in an M-File:
function F = myfun(x)
F = data(:,2)-(x(1).*data(:,1))./x(2)+data(:,1);
If you run the code you will see my problem. But hopefully, unlike me, you see what I'm doing wrong.
I think that you forgot some parentheses (some others are superfluous) in your nonlinear function. Using an anonymous function:
myfun = #(x)data(:,2)-x(1).*data(:,1)./(x(2)+data(:,1)); % Parentheses were missing
options = optimset('MaxIter',10000,'MaxFunEvals',50000,'FunValCheck','on',...
'Algorithm',{'levenberg-marquardt',.00001});
p = lsqnonlin(myfun,[0.1424,2.5444],[],[],options);
lse = p(1).*data(:,1)./(p(2)+data(:,1));
plot(data(:,1),data(:,2),'o','color','red','linewidth',1)
line(data(:,1),lse,'linewidth',2)
You also weren't actually applying any of your options.
You might look into using lsqcurvefit instead as it was designed for data fitting problems:
myfun = #(x,dat)x(1).*dat./(x(2)+dat);
options = optimset('MaxIter',10000,'MaxFunEvals',50000,'FunValCheck','on',...
'Algorithm',{'levenberg-marquardt',.00001});
p = lsqcurvefit(myfun,[0.1424,2.5444],data(:,1),data(:,2),[],[],options);
lse = myfun(p,data(:,1));
plot(data(:,1),data(:,2),'o','color','red','linewidth',1)
line(data(:,1),lse,'linewidth',2)
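With the model written as x(1).*dat./(x(2)+dat), the fitted parameters map directly onto the Michaelis–Menten constants, so after either call they can be read off (a short usage sketch):
Vmax = p(1);    % maximum reaction rate
Km   = p(2);    % Michaelis constant
fprintf('Vmax = %.4f, Km = %.4f\n', Vmax, Km);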

clustering using PCA

I am doing unsupervised classification. I have n features and I want to use PCA to project the data into a new subspace and then perform clustering on the PCA output. I have written the following code:
mu = mean(feature);                     % column means, computed once before centering
for c = 1:size(feature,1)
    feature(c,:) = feature(c,:) - mu;   % center each observation
end
DataCov=cov(feature); % covariance matrix
[PC,latent,explained] = pcacov(DataCov);
This gives me :
PC =
0.6706 0.7348 0.0965 0.0316 -0.0003 -0.0001
0.0009 -0.0060 0.0298 0.0378 0.8157 -0.5764
0.0391 -0.1448 0.5661 0.8091 -0.0406 0.0264
0.7403 -0.6543 -0.1461 -0.0505 0.0018 -0.0005
0.0003 -0.0020 0.0193 -0.0116 0.5768 0.8166
0.0264 -0.1047 0.8048 -0.5832 -0.0151 -0.0169
latent =
0.0116
0.0001
0.0000
0.0000
0.0000
0.0000
explained =
98.8872 <-----
1.0445
0.0478
0.0205
0.0000
0.0000
explained shows that only the first component (indicated by <--) really contributes a significant amount to explained variance.
Is it possible to create new features using only the first component?
The following gives me a new feature set, feature_New, using all principal components. Is this the right way to create a new feature set on which I can perform clustering?
feature_New= feature*PC;
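If you keep only the first component, a minimal sketch of the projection and a subsequent clustering step would be (kmeans and the number of clusters are assumptions, since the clustering method is not specified above):
feature_1d = feature * PC(:, 1);    % project the centered features onto the first component
k = 3;                              % number of clusters: an assumption
idx = kmeans(feature_1d, k);        % cluster labels in the reduced space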