Strange result of spap2 - matlab

I am getting strange results from spap2 on some data.
The actual data is the blue curve, the red circles are the knots I am using, and the yellow curve is the cubic spline fit.
The code is quite simple, and I cannot figure out what the problem is:
spgood = spap2(knots_zY, 4, ec, Y);
plot(ec, Y);
hold on;
scatter(knots_zY, Y(ismember(ec, knots_zY)));  % mark the data values at the knot sites
fnplt(spgood)
ec is the vector -4.12:0.02:-0.54.
Y is the following vector:
4.1291 4.0732 4.0173 4.2624 4.3826 4.3267 4.2708 4.4367 4.3808 4.1031 4.1721 3.8152 4.1572
4.1013 4.0454 3.5916 3.8367 3.7808 3.8218 3.6690 3.9141 3.7333 3.8023 3.3204 3.5656 3.4305
3.5787 3.3978 3.3419 3.2860 3.4062 3.4753 3.5706 3.2385 3.1826 3.4947 3.5315 3.1746 3.2089
3.2276 3.1940 2.9162 3.0364 3.0263 2.8155 2.7596 2.9555 2.8996 2.9081 2.7322 2.8524 2.6397
2.7662 2.5279 2.5417 2.2005 2.3409 2.5108 2.5202 2.3359 2.3660 2.3100 2.1682 2.1123 2.2140
2.1288 2.1116 1.9856 2.0089 1.8845 1.9148 1.9308 1.7273 1.7642 1.7326 1.6606 1.7378 1.6570
1.5815 1.5701 1.4630 1.5503 1.5181 1.4385 1.3083 1.3168 1.2991 1.2523 1.1390 0.9988 1.0373
0.9913 1.0113 0.9754 0.8912 0.8790 0.7491 0.7557 0.7544 0.7119 0.7031 0.6843 0.6418 0.5938
0.5193 0.5334 0.4312 0.4839 0.4437 0.3992 0.3689 0.3287 0.3348 0.3076 0.2274 0.2174 0.1970
0.2188 0.1760 0.1384 0.1773 0.1342 0.1388 0.1097 0.0830 0.0782 0.0725 0.0863 0.0581 0.0466
0.0398 0.0431 0.0187 0.0187 0.0176 0.0167 0.0231 0.0033 -0.0117 -0.0016 0.0084 -0.0055 -0.0120
-0.0080 -0.0064 -0.0075 -0.0134 -0.0075 0.0012 -0.0077 -0.0024 0.0006 0.0010 0.0043 0.0016 0.0018
0.0042 0.0030 0.0029 0.0029 0.0021 0.0013 -0.0002 -0.0020 -0.0030 -0.0032 -0.0002 -0.0013 0.0035
0.0028 -0.0000 -0.0057 -0.0032 0.0020 0.0597 0.1835 0.5083 1.0275 1.6448 3.0549
The knots are defined with the following 12 values:
-4.1200 -3.9400 -3.5400 -3.3000 -3.1400 -2.6800 -2.3600 -2.0600 -1.5000 -1.1600 -0.7000 -0.5400
I don't expect a perfect fit, but I would at least expect the spline fit to stick to the knots ... here the result is completely erroneous. I am stuck, unable to see where the problem is with this data sample.
Note: the knots are computed by a separate algorithm and are meant to be used for the interpolation; getting a good fit is not the question here. The question is why the spline fit does not pass through the knots.

I have made several errors.
First, it is a mistake to assume that the resulting spline will pass through the knots, because spap2 computes an approximation (see this answer). The approximation smooths the whole original data, so there is no reason for it to stick to the knots.
Second, I forgot to extend the end knots to impose the boundary conditions. The default is to force all derivatives (including the 0th order) to zero at the ends, which produces this shape. The solution is to use augknt to get an actual cubic spline with two continuous derivatives:
spgood = spap2(augknt(knots_zY,4), 4, ec, Y);
The resulting fit is much better, given the choice of the knot sequence.
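For reference, here is a minimal sketch of the corrected call, together with a spline interpolant (spapi, also from the Curve Fitting Toolbox) in case a curve that actually passes through the data at the knot sites is what is needed; variable names follow the question, and the interpolation part is an assumption about intent, not part of the original answer:
% Least-squares cubic approximation on the augmented knot sequence
spgood = spap2(augknt(knots_zY, 4), 4, ec, Y);
% If the curve must pass exactly through the data at the knot sites,
% interpolate at those sites instead of approximating the whole data set
vals  = interp1(ec, Y, knots_zY);   % data values at the knot sites
spint = spapi(4, knots_zY, vals);   % cubic spline interpolant
plot(ec, Y, 'b'); hold on
fnplt(spgood, 'y');
fnplt(spint, 'g');
scatter(knots_zY, vals, 'r');
legend('data', 'spap2 approximation', 'spapi interpolant', 'knot sites');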


Same optimization algorithm resulting in different coefficients and computation time

I noticed that when I run an optimization with MATLAB's fmincon function, for example with the active-set method, the resulting coefficients and the computation time are different every time I run the algorithm. I do not change the starting points; I just run the method one time after another.
I understand this is the case for global optimization algorithms, since they use stochastic methods internally. But why does the same thing happen for local minimization methods?
EDIT: adding the code
options = optimset('fmincon') ;
options = optimset(options,...
'LargeScale','on',...
'Algorithm','active-set',...
'MaxFunEvals',10000,...
'Display','off',...
'TolCon', 1e-10,...
'TolFun', 1e-10,...
'TolX', 1e-10,...
'MaxIter',10000) ;
[Param,fval,exitflag,output] = fmincon(@Cost_FCN, Param0,[],[],[],[],LB,UB,[],options,Fz,Side,Camber,List,Coeff,Fy) ;
Run 1     Run 2
1.8181    1.8181
0.1737    0.1737
0.0210    0.0210
0.0004    0.0004
0.0000    0.0000
9.5810    9.5811
4.7975    4.7981
1.5981    1.5991
9.9934    9.9934
0.8277    0.8277
0.0359    0.0360
0.2438    0.2437
0.0125    0.0125
0.0051    0.0051
0.0041    0.0041
Bounds and constraints are constant and the starting points are the same every time I run the method.
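A minimal sketch of how the observation can be checked with identical inputs (the cost function and data arguments are the ones from the question; the anonymous wrapper is just a modern way to pass the extra parameters):
costfun = @(p) Cost_FCN(p, Fz, Side, Camber, List, Coeff, Fy);  % fix all data arguments
rng default                                                     % pin any random number use
tic; [P1, f1] = fmincon(costfun, Param0, [],[],[],[], LB, UB, [], options); t1 = toc;
tic; [P2, f2] = fmincon(costfun, Param0, [],[],[],[], LB, UB, [], options); t2 = toc;
fprintf('max |P1 - P2| = %g\n', max(abs(P1 - P2)));
fprintf('run times: %.3f s vs %.3f s\n', t1, t2);
If the two runs still differ with everything pinned like this, the difference is coming from the solver's own internal computations rather than from the inputs.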

how to evaluate curve fitting in Matlab

I'm using MATLAB to analyse some data, for which I need curve fitting. I've written this code based on the documentation:
% I is a 14-point vector that changes its value in a loop
y =0:13;
[p,S] = polyfit(I,y,1);
[fx, delta] = polyval(p,I,S);
plot(y,I,'+',fx,I,'-');
Here is what I get:
My question is: how can I evaluate this fit, i.e. how good it is, and how can I get the slope of this line?
UPDATE
After Rafaeli's answer, I had some trouble understanding the results, since fx is the curve fitted to y as a function of I, meaning that I get for fx:
-1.0454 3.0800 4.3897 6.5324 4.0947 3.8975 4.3476 9.0088 5.8307 6.7166 9.8243 11.4009 11.9223
whereas the I values are:
0.0021 0.0018 0.0017 0.0016 0.0018 0.0018 0.0017 0.0014 0.0016 0.0016 0.0014 0.0012 0.0012 0.0013
and the plot shows exactly the values of I:
So the result I hope to get should be close to those values! I tried to switch the arguments:
[p,S] = polyfit(y,I,1);
but it wasn't any better: fx stayed around 0.0020. So my question is, how can I do that?
2nd UPDATE
Got it, here is the code:
y = 0:13
p = polyfit(y,I,1)
fx = polyval(p,y);
plot(y,I,'+',y,fx,'o')
Here is the result:
Thanks for any help!
The line is defined by y = ax + b, where a = p(1) and b = p(2), so the slope is p(1).
A simple way to know how good the fit is, is to take the root mean square of the error: rms(fx - I). The smaller the value, the better the fit.
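Putting that together with the corrected code from the second update (a minimal sketch; sqrt(mean(...)) is used instead of rms so no extra toolbox is needed, and I is assumed to be the 14-element vector from the question):
y  = 0:13;                     % x values
p  = polyfit(y, I, 1);         % linear fit: I ~ p(1)*y + p(2)
fx = polyval(p, y);            % fitted values
slope  = p(1);                          % slope of the fitted line
rmsErr = sqrt(mean((fx - I).^2));       % root-mean-square error of the fit
plot(y, I, '+', y, fx, '-');
fprintf('slope = %g, RMS error = %g\n', slope, rmsErr);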

weighted correlation matrix

In general I know that I can easily calculate a correlation matrix in MATLAB; there are plenty of functions for this. But what about a weighted correlation? I found this MATLAB file:
http://www.mathworks.com/matlabcentral/fileexchange/20846-weighted-correlation-matrix/content/weightedcorrs.m
But how are the weights chosen? Does that depend on one's intuition, or is there a standard way?
Let's say we have
x = randn(30,4)
x =
0.5377 0.8884 -1.0891 -1.1480
1.8339 -1.1471 0.0326 0.1049
-2.2588 -1.0689 0.5525 0.7223
0.8622 -0.8095 1.1006 2.5855
0.3188 -2.9443 1.5442 -0.6669
-1.3077 1.4384 0.0859 0.1873
-0.4336 0.3252 -1.4916 -0.0825
0.3426 -0.7549 -0.7423 -1.9330
3.5784 1.3703 -1.0616 -0.4390
2.7694 -1.7115 2.3505 -1.7947
-1.3499 -0.1022 -0.6156 0.8404
3.0349 -0.2414 0.7481 -0.8880
0.7254 0.3192 -0.1924 0.1001
-0.0631 0.3129 0.8886 -0.5445
0.7147 -0.8649 -0.7648 0.3035
-0.2050 -0.0301 -1.4023 -0.6003
-0.1241 -0.1649 -1.4224 0.4900
1.4897 0.6277 0.4882 0.7394
1.4090 1.0933 -0.1774 1.7119
1.4172 1.1093 -0.1961 -0.1941
0.6715 -0.8637 1.4193 -2.1384
-1.2075 0.0774 0.2916 -0.8396
0.7172 -1.2141 0.1978 1.3546
1.6302 -1.1135 1.5877 -1.0722
0.4889 -0.0068 -0.8045 0.9610
1.0347 1.5326 0.6966 0.1240
0.7269 -0.7697 0.8351 1.4367
-0.3034 0.3714 -0.2437 -1.9609
0.2939 -0.2256 0.2157 -0.1977
-0.7873 1.1174 -1.1658 -1.2078
and we have done
x(:,4) = sum(x,2); % Introduce correlation.
[r,p] = corrcoef(x) % Compute sample correlation and p-values.
and got
r =
1.0000 -0.0352 0.2673 0.6901
-0.0352 1.0000 -0.5101 0.2617
0.2673 -0.5101 1.0000 0.3504
0.6901 0.2617 0.3504 1.0000
This is the unweighted correlation. How can I do a weighted correlation with the help of that MATLAB file? Please help me.
This function needs the weights of each observation as input. How you choose them is up to you.
If the observations were, for example, outputs of a simulation, you could let the weights be the number of iterations performed. If they were stock results, consider using the value in the portfolio. However, there is no standard way to get the 'best' weights in general. Just keep in mind that a value that is more reliable should typically get more weight.
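As a minimal sketch of how that file would be called (assuming its interface is weightedcorrs(Y, w), a data matrix plus one positive weight per row, as its File Exchange description suggests; the weights below are purely illustrative):
x = randn(30, 4);
x(:,4) = sum(x, 2);              % introduce correlation, as in the question
w  = (1:30)';                    % example weights: later observations count more
Rw = weightedcorrs(x, w);        % weighted correlation matrix (FEX function)
Ru = corrcoef(x);                % unweighted correlation, for comparison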

clustering using PCA

I am doing unsupervised classification. For this I have n features, and I want to use PCA to project the data into a new subspace and then perform clustering on the output of PCA. I have written the following code:
colMeans = mean(feature);                    % column means, computed once (not inside the loop)
for c = 1:size(feature,1)
    feature(c,:) = feature(c,:) - colMeans;  % center each observation
end
DataCov = cov(feature);                      % covariance matrix
[PC,latent,explained] = pcacov(DataCov);
This gives me :
PC =
0.6706 0.7348 0.0965 0.0316 -0.0003 -0.0001
0.0009 -0.0060 0.0298 0.0378 0.8157 -0.5764
0.0391 -0.1448 0.5661 0.8091 -0.0406 0.0264
0.7403 -0.6543 -0.1461 -0.0505 0.0018 -0.0005
0.0003 -0.0020 0.0193 -0.0116 0.5768 0.8166
0.0264 -0.1047 0.8048 -0.5832 -0.0151 -0.0169
latent =
0.0116
0.0001
0.0000
0.0000
0.0000
0.0000
explained =
98.8872 <-----
1.0445
0.0478
0.0205
0.0000
0.0000
explained shows that only the first component (indicated by <-----) contributes a significant amount of explained variance.
Is it possible to create new features using only the first component?
The following gives me a new feature set, feature_New, using all principal components. Is this the right way to create a new feature set on which I can perform clustering?
feature_New= feature*PC;
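As a minimal sketch of the projection the question asks about (pcacov returns the columns of PC sorted by explained variance, so keeping only the first column keeps the dominant component; kmeans and the cluster count k are placeholders for whatever clustering step follows):
Xc     = feature - repmat(mean(feature,1), size(feature,1), 1);  % center (harmless if already centered)
nKeep  = 1;                          % keep only the first principal component
scores = Xc * PC(:, 1:nKeep);        % new, projected features
k   = 3;                             % number of clusters: problem-specific
idx = kmeans(scores, k);             % cluster on the projected features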

Finding more than one maximum in an array

I would like to find more than one maximum value from an array using Matlab.
Here is my code that returns only one max and its position:
[peak, pos] = max(abs(coeffs));
The problem is that I want to detect more than one maximum in the array. In fact, I need to detect the first two peaks and their positions in the following array:
>> abs(coeffs())
ans =
0.5442
0.5465
0.5545
0.5674
0.5862
0.6115
0.6438
0.6836
0.7333
0.7941
0.8689
0.9608
1.0751
1.2188
1.4027
1.6441
1.9701
2.4299
3.1178
4.2428
6.3792
11.8611
53.7537
24.9119
10.8982
7.3470
5.7768
4.9340
4.4489
4.1772
4.0564
4.0622
4.1949
4.4801
4.9825
5.8496
7.4614
11.1087
25.6071
53.2831
12.0029
6.4743
4.3096
3.1648
2.4631
1.9918
1.6558
1.4054
1.2129
1.0608
0.9379
0.8371
0.7532
0.6827
0.6224
0.5702
0.5255
0.4861
0.4517
0.4212
0.3941
0.3698
0.3481
0.3282
0.3105
0.2946
0.2796
0.2665
0.2541
0.2429
0.2326
0.2230
0.2141
0.2057
0.1986
0.1914
0.1848
0.1787
0.1729
0.1677
0.1627
0.1579
0.1537
0.1494
0.1456
0.1420
0.1385
0.1353
0.1323
0.1293
0.1267
0.1239
0.1216
0.1192
0.1172
0.1151
0.1132
0.1113
0.1096
0.1080
0.1064
0.1048
0.1038
0.1024
0.1011
0.1000
0.0987
0.0978
0.0967
0.0961
0.0951
0.0943
0.0936
0.0930
0.0924
0.0917
0.0913
0.0908
0.0902
0.0899
0.0894
0.0892
0.0889
0.0888
0.0885
0.0883
0.0882
0.0883
0.0882
0.0883
0.0882
0.0883
0.0885
0.0888
0.0889
0.0892
0.0894
0.0899
0.0902
0.0908
0.0913
0.0917
0.0924
0.0930
0.0936
0.0943
0.0951
0.0961
0.0967
0.0978
0.0987
0.1000
0.1011
0.1024
0.1038
0.1048
0.1064
0.1080
0.1096
0.1113
0.1132
0.1151
0.1172
0.1192
0.1216
0.1239
0.1267
0.1293
0.1323
0.1353
0.1385
0.1420
0.1456
0.1494
0.1537
0.1579
0.1627
0.1677
0.1729
0.1787
0.1848
0.1914
0.1986
0.2057
0.2141
0.2230
0.2326
0.2429
0.2541
0.2665
0.2796
0.2946
0.3105
0.3282
0.3481
0.3698
0.3941
0.4212
0.4517
0.4861
0.5255
0.5702
0.6224
0.6827
0.7532
0.8371
0.9379
1.0608
1.2129
1.4054
1.6558
1.9918
2.4631
3.1648
4.3096
6.4743
12.0029
53.2831
25.6071
11.1087
7.4614
5.8496
4.9825
4.4801
4.1949
4.0622
4.0564
4.1772
4.4489
4.9340
5.7768
7.3470
10.8982
24.9119
53.7537
11.8611
6.3792
4.2428
3.1178
2.4299
1.9701
1.6441
1.4027
1.2188
1.0751
0.9608
0.8689
0.7941
0.7333
0.6836
0.6438
0.6115
0.5862
0.5674
0.5545
0.5465
The reason I need only the first two maximum values is that the last two are reflections of the first two, as a result of the fast Fourier transform.
You can use many peak-finding tools to do that. Here are some of them:
Findpeaks
The function [pks,locs] = findpeaks(data) returns local maxima or peaks, pks, in the input data at locations locs (sorted from first to last found). Data requires a row or column vector with real-valued elements with a minimum length of three. findpeaks compares each element of data to its neighboring values. If an element of data is larger than both of its neighbors or equals Inf, the element is a local peak. If there are no local maxima, pks is an empty vector.
For example:
[pks,locs] = findpeaks(abs(coeffs))
plot(abs(coeffs)); hold on
plot(locs(1:2),pks(1:2),'ro');
1D Non-derivative Peak Finder - a FEX tool that finds peaks without taking first or second derivatives; instead, it uses local slope features of the data set.
PeakFinder - another peak finder from the FEX by nate yoder.
and there are plenty more of these in the FEX...
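If the first two peaks returned are not guaranteed to be the two largest, a small variation of the findpeaks example above sorts the detected peaks by height before picking two (a sketch, using the same call as in the answer):
[pks, locs] = findpeaks(abs(coeffs));      % all local maxima
[~, order]  = sort(pks, 'descend');        % sort the peaks by height
top2        = order(1:2);                  % the two largest peaks
plot(abs(coeffs)); hold on
plot(locs(top2), pks(top2), 'ro');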