least squares with seasonal component in matlab - matlab

I was reading a paper that investigated trends in monthly wind speed data over the past 20 years or so. The paper uses a number of different statistical approaches, which I am trying to replicate here.
The first method used is a simple linear regression model of the form
$$ y(t) = a_{1}t + b_{1} $$
where $a_{1}$ and $b_{1}$ can be determined by standard least squares.
Then they specify that some of the potential error in the linear regression model can be removed explicitly by accounting for the seasonal signal by fitting a model of the form:
$$ y(t) = a_{2}t + b_{2}\sin\left(\frac{2\pi}{12}t + c_{2}\right) + d_{2}$$
where coefficients $a_{2}$, $b_{2}$, $c_{2}$, and $d_{2}$ can be determined by least squares. They then go on to specify that this model was also tested with additional harmonic components of 3, 4, and 6 months.
Using the following data as an example:
% 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960
y = [112 115 145 171 196 204 242 284 315 340 360 417 % Jan
118 126 150 180 196 188 233 277 301 318 342 391 % Feb
132 141 178 193 236 235 267 317 356 362 406 419 % Mar
129 135 163 181 235 227 269 313 348 348 396 461 % Apr
121 125 172 183 229 234 270 318 355 363 420 472 % May
135 149 178 218 243 264 315 374 422 435 472 535 % Jun
148 170 199 230 264 302 364 413 465 491 548 622 % Jul
148 170 199 242 272 293 347 405 467 505 559 606 % Aug
136 158 184 209 237 259 312 355 404 404 463 508 % Sep
119 133 162 191 211 229 274 306 347 359 407 461 % Oct
104 114 146 172 180 203 237 271 305 310 362 390 % Nov
118 140 166 194 201 229 278 306 336 337 405 432 ]; % Dec
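yr = repmat((1949:1960),12,1); % year of each entry of y
mo = repmat((1:12)',1,12); % month of each entry of y
time = datestr(datenum(yr(:),mo(:),1));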
jday = datenum(time,'dd-mmm-yyyy');
y2 = reshape(y,[],1);
plot(jday,y2)
Can anyone demonstrate how the model above can be written in matlab?

Notice that your model is actually linear in its coefficients. A trigonometric identity (the angle-addition formula for the sine) shows this, so ordinary least squares is enough. To fit a genuinely nonlinear model, use nlinfit.
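To spell out the identity, here is a short derivation. The symbols $\omega$, $p$ and $q$ are introduced only for this illustration and do not appear in the paper. With $\omega = 2\pi/12$, the angle-addition formula gives
$$ b_{2}\sin(\omega t + c_{2}) = b_{2}\cos(c_{2})\sin(\omega t) + b_{2}\sin(c_{2})\cos(\omega t) $$
so, writing $p = b_{2}\cos(c_{2})$ and $q = b_{2}\sin(c_{2})$, the seasonal model becomes
$$ y(t) = a_{2}t + p\sin(\omega t) + q\cos(\omega t) + d_{2} $$
which is linear in $a_{2}$, $p$, $q$ and $d_{2}$. The original amplitude and phase can be recovered as $b_{2} = \sqrt{p^{2} + q^{2}}$ and $c_{2} = \operatorname{atan2}(q, p)$.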
Using your data I wrote the following script to compute and compare the different methods:
(you can comment out the opts.RobustWgtFun = 'bisquare'; line to see that it's exactly like the linear fit with the 12 periodicity)
% y = [112 115 ...
y2 = reshape(y,[],1);
t=(1:144).';
% trend
T = [ones(size(t)) t];
B=T\y2;
y_trend = T*B;
% least squares, using linear fit and the 12 periodicity only
T = [ones(size(t)) t sin(2*pi*t/12) cos(2*pi*t/12)];
B=T\y2;
y_sincos = T*B;
% least squares, using linear fit and 3,4,6,12 periodicities
addharmonics = [3 4 6];
T = [T bsxfun(@(h,t)sin(2*pi*t/h),addharmonics,t) bsxfun(@(h,t)cos(2*pi*t/h),addharmonics,t)];
B=T\y2;
y_sincos2 = T*B;
% least squares with bisquare weights,
% using non-linear model of a linear fit and the 12 periodicity only
opts = statset('nlinfit');
opts.RobustWgtFun = 'bisquare';
b0 = [1;1;0;1];
modelfun = @(b,x) b(1)*x+b(2)*sin((b(3)+x)*2*pi/12)+b(4);
b = nlinfit(t,y2,modelfun,b0,opts);
% plot a comparison
figure
plot(t,y2,t,y_trend,t,modelfun(b,t),t,y_sincos,t,y_sincos2)
legend('Original','Trend','bisquare weight - 12 periodicity only', ...
'least square - 12 periodicity only','least square - 3,4,6,12 periodicities', ...
'Location','NorthWest');
xlim([min(t) max(t)]); % same limits as minmax(t'), without the toolbox-only minmax
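The linear fit returns sine and cosine coefficients rather than the amplitude b2 and phase c2 of the paper's model. Since b2*sin(w*t + c2) = b2*cos(c2)*sin(w*t) + b2*sin(c2)*cos(w*t), the two can be recovered from the fitted coefficients. The following is a minimal sketch on a synthetic series with made-up parameter values (not the data from the question), so the recovered numbers can be compared with known ones:

```matlab
% Synthetic monthly series with known seasonal amplitude and phase
% (illustrative values, not taken from the question's data)
t = (1:144).';
omega = 2*pi/12;
y_demo = 0.5*t + 4*sin(omega*t + 0.7) + 3;

% Linear least squares with sine and cosine regressors
T = [ones(size(t)) t sin(omega*t) cos(omega*t)];
B = T\y_demo;

% Map the coefficients back to the paper's parameterisation
d2 = B(1);
a2 = B(2);
b2 = hypot(B(3), B(4));      % amplitude of the seasonal term
c2 = atan2(B(4), B(3));      % phase of the seasonal term, in radians
fprintf('a2 = %.4f, b2 = %.4f, c2 = %.4f, d2 = %.4f\n', a2, b2, c2, d2);
```

Note that the nlinfit model in the script above places the phase inside the factor 2*pi/12, so its b(3) is measured in months rather than radians.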

Related

changing the range / limits on a polar chart in octave / matlab

I'm using Octave 4.0 on Linux, which is similar to Matlab.
Is it possible to have a different range of numbers on a polar chart and have them show up along with the degrees?
The normal polar plot goes from 0-359 degrees (shown in black in the image). I would like the range and tick values to go from 0 to 100 (shown in red in the image). Is this possible? If so, can two ranges be shown on a polar plot at the same time (0-359 and 0-100), almost like plotting two y axes using plotyy on the same plot?
See the image of the polar plot below.
Here are the numbers 0-359 and their corresponding numbers 0-100 matching up.
0 0
1 0.27855
2 0.5571
3 0.83565
4 1.11421
5 1.39276
6 1.67131
7 1.94986
8 2.22841
9 2.50696
10 2.78552
11 3.06407
12 3.34262
13 3.62117
14 3.89972
15 4.17827
16 4.45682
17 4.73538
18 5.01393
19 5.29248
20 5.57103
21 5.84958
22 6.12813
23 6.40669
24 6.68524
25 6.96379
26 7.24234
27 7.52089
28 7.79944
29 8.07799
30 8.35655
31 8.6351
32 8.91365
33 9.1922
34 9.47075
35 9.7493
36 10.0279
37 10.3064
38 10.585
39 10.8635
40 11.1421
41 11.4206
42 11.6992
43 11.9777
44 12.2563
45 12.5348
46 12.8134
47 13.0919
48 13.3705
49 13.649
50 13.9276
51 14.2061
52 14.4847
53 14.7632
54 15.0418
55 15.3203
56 15.5989
57 15.8774
58 16.156
59 16.4345
60 16.7131
61 16.9916
62 17.2702
63 17.5487
64 17.8273
65 18.1058
66 18.3844
67 18.663
68 18.9415
69 19.2201
70 19.4986
71 19.7772
72 20.0557
73 20.3343
74 20.6128
75 20.8914
76 21.1699
77 21.4485
78 21.727
79 22.0056
80 22.2841
81 22.5627
82 22.8412
83 23.1198
84 23.3983
85 23.6769
86 23.9554
87 24.234
88 24.5125
89 24.7911
90 25.0696
91 25.3482
92 25.6267
93 25.9053
94 26.1838
95 26.4624
96 26.7409
97 27.0195
98 27.2981
99 27.5766
100 27.8552
101 28.1337
102 28.4123
103 28.6908
104 28.9694
105 29.2479
106 29.5265
107 29.805
108 30.0836
109 30.3621
110 30.6407
111 30.9192
112 31.1978
113 31.4763
114 31.7549
115 32.0334
116 32.312
117 32.5905
118 32.8691
119 33.1476
120 33.4262
121 33.7047
122 33.9833
123 34.2618
124 34.5404
125 34.8189
126 35.0975
127 35.376
128 35.6546
129 35.9331
130 36.2117
131 36.4903
132 36.7688
133 37.0474
134 37.3259
135 37.6045
136 37.883
137 38.1616
138 38.4401
139 38.7187
140 38.9972
141 39.2758
142 39.5543
143 39.8329
144 40.1114
145 40.39
146 40.6685
147 40.9471
148 41.2256
149 41.5042
150 41.7827
151 42.0613
152 42.3398
153 42.6184
154 42.8969
155 43.1755
156 43.454
157 43.7326
158 44.0111
159 44.2897
160 44.5682
161 44.8468
162 45.1253
163 45.4039
164 45.6825
165 45.961
166 46.2396
167 46.5181
168 46.7967
169 47.0752
170 47.3538
171 47.6323
172 47.9109
173 48.1894
174 48.468
175 48.7465
176 49.0251
177 49.3036
178 49.5822
179 49.8607
180 50.1393
181 50.4178
182 50.6964
183 50.9749
184 51.2535
185 51.532
186 51.8106
187 52.0891
188 52.3677
189 52.6462
190 52.9248
191 53.2033
192 53.4819
193 53.7604
194 54.039
195 54.3175
196 54.5961
197 54.8747
198 55.1532
199 55.4318
200 55.7103
201 55.9889
202 56.2674
203 56.546
204 56.8245
205 57.1031
206 57.3816
207 57.6602
208 57.9387
209 58.2173
210 58.4958
211 58.7744
212 59.0529
213 59.3315
214 59.61
215 59.8886
216 60.1671
217 60.4457
218 60.7242
219 61.0028
220 61.2813
221 61.5599
222 61.8384
223 62.117
224 62.3955
225 62.6741
226 62.9526
227 63.2312
228 63.5097
229 63.7883
230 64.0669
231 64.3454
232 64.624
233 64.9025
234 65.1811
235 65.4596
236 65.7382
237 66.0167
238 66.2953
239 66.5738
240 66.8524
241 67.1309
242 67.4095
243 67.688
244 67.9666
245 68.2451
246 68.5237
247 68.8022
248 69.0808
249 69.3593
250 69.6379
251 69.9164
252 70.195
253 70.4735
254 70.7521
255 71.0306
256 71.3092
257 71.5877
258 71.8663
259 72.1448
260 72.4234
261 72.7019
262 72.9805
263 73.2591
264 73.5376
265 73.8162
266 74.0947
267 74.3733
268 74.6518
269 74.9304
270 75.2089
271 75.4875
272 75.766
273 76.0446
274 76.3231
275 76.6017
276 76.8802
277 77.1588
278 77.4373
279 77.7159
280 77.9944
281 78.273
282 78.5515
283 78.8301
284 79.1086
285 79.3872
286 79.6657
287 79.9443
288 80.2228
289 80.5014
290 80.7799
291 81.0585
292 81.337
293 81.6156
294 81.8942
295 82.1727
296 82.4513
297 82.7298
298 83.0084
299 83.2869
300 83.5655
301 83.844
302 84.1226
303 84.4011
304 84.6797
305 84.9582
306 85.2368
307 85.5153
308 85.7939
309 86.0724
310 86.351
311 86.6295
312 86.9081
313 87.1866
314 87.4652
315 87.7437
316 88.0223
317 88.3008
318 88.5794
319 88.8579
320 89.1365
321 89.415
322 89.6936
323 89.9721
324 90.2507
325 90.5292
326 90.8078
327 91.0864
328 91.3649
329 91.6435
330 91.922
331 92.2006
332 92.4791
333 92.7577
334 93.0362
335 93.3148
336 93.5933
337 93.8719
338 94.1504
339 94.429
340 94.7075
341 94.9861
342 95.2646
343 95.5432
344 95.8217
345 96.1003
346 96.3788
347 96.6574
348 96.9359
349 97.2145
350 97.493
351 97.7716
352 98.0501
353 98.3287
354 98.6072
355 98.8858
356 99.1643
357 99.4429
358 99.7214
359 100
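The table above appears to follow a simple linear rescaling, value = degrees * 100 / 359 (for example, 1 maps to 0.27855 and 359 maps to 100). Assuming that is the intended mapping, the whole table could be generated rather than typed in. The variable names below are chosen only for illustration:

```matlab
% Rebuild the lookup table under the assumption of a linear mapping
deg = (0:359).';              % angles in degrees
scaled = deg * 100 / 359;     % corresponding values in the 0 - 100 range
lookup = [deg scaled];        % two columns, like the table above

% Print the first few rows for comparison with the table
fprintf('%d %g\n', lookup(1:5, :).');
```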
[Image: the numbers 0-359 and their corresponding numbers 0-100 matching up]
In Octave, the polar plot object adds the rtick and ttick properties to the parent axes, which let you change the location of the ticks. Unfortunately, there is no tticklabel property that we could use to easily find and modify the theta tick labels.
Instead, we can plot the data whose theta range is 0 - 100 by first transforming it to the range 0 - 2*pi (as polar expects). Then, after plotting both data sets, we can use findall to locate all text objects, figure out which ones are the theta tick labels, create a copy of them, and modify one of the two sets so that it reads 0 - 100.
% Your first plot is going to use the 0 - 2pi range for theta
theta1 = linspace(0, 2*pi, 1000);
rho1 = sin(theta1 * 5);
plot1 = polar(theta1, rho1);
hold on
% For your second plot, just transform your 0 - 100 range to be 0 - 2pi instead
theta2 = 0:100;
rho2 = linspace(0, 1, numel(theta2));
modtheta2 = 2*pi * (theta2 ./ 100);
plot2 = polar(modtheta2, rho2);
% Now we need to modify all of the labels
% Find all of the text objects in the axes
labels = findall(gca, 'type', 'text');
% Figure out which ones are the theta (angle) labels. They all sit at the same
% distance from the center of the plot, so we compute each label's distance
% and keep the labels at the most common one
distances = cellfun(@(x)norm(x(1:2)), get(labels, 'Position'));
% Figure out the most common distance
[~, ~, b] = unique(round(distances * 100));
h = hist(b, 1:max(b));
labels = labels(b == find(h == max(h), 1));
% Make a copy of these labels (have to use arrayfun for 4.0.x compatibility)
blacklabels = arrayfun(@(L)copyobj(L, gca), labels);
% Shift the copies outward by 15%
arrayfun(@(x)set(x, 'Position', get(x, 'Position') * 1.15), blacklabels);
% Now set the original labels to red and change their values
set(labels, 'Color', 'red')
for k = 1:numel(labels)
    value = str2double(get(labels(k), 'String'));
    % Convert the value from the 0 - 360 range to the 0 - 100 range
    newvalue = 100 * (value / 360);
    set(labels(k), 'String', sprintf('%0.2f', newvalue))
end

Creation of a loop loading values from .txt files

I have a problem creating a loop that loads each value from ".txt" files and uses it in some calculations.
All the values are in the 2nd column, and the first one is always on the 9th line of each file.
Each ".txt" file contains a different number of values in its 2nd column (they all have the same text after the final value), so I want a loop that reads those values and stops when it reaches that text.
Here is an example of one of these files (the values that interest me are the ones under the heading G: 33, 55, 93, ..., 18):
Latitude: 34°40'30" North,
Longitude: 3°16'6" East
Results for: April
Inclination of plane: 32 deg.
Orientation (azimuth) of plane: 0 deg.
Time G Gd Gc DNI DNIc A Ad Ac
05:52 33 33 25 0 0 233 64 311
06:07 55 44 47 246 361 356 105 473
06:22 93 59 92 312 459 444 124 590
06:37 136 73 147 366 538 514 138 684
06:52 183 86 207 410 602 572 150 760
07:07 232 98 271 447 656 620 160 823
07:22 283 110 337 478 701 659 168 874
16:37 283 110 337 478 701 659 168 874
16:52 232 98 271 447 656 620 160 823
17:07 183 86 207 410 602 572 150 760
17:22 136 73 147 366 538 514 138 684
17:37 93 59 92 312 459 444 124 590
17:52 55 44 47 246 361 356 105 473
18:07 33 33 25 0 0 233 64 311
18:22 18 18 14 0 0 9 8 7
G: Global irradiance on a fixed plane (W/m2)
Gd: Diffuse irradiance on a fixed plane (W/m2)
Gc: Global clear-sky irradiance on a fixed plane (W/m2)
DNI: Direct normal irradiance (W/m2)
DNIc: Clear-sky direct normal irradiance (W/m2)
A: Global irradiance on 2-axis tracking plane (W/m2)
Ad: Diffuse irradiance on 2-axis tracking plane (W/m2)
Ac: Global clear-sky irradiance on 2-axis tracking plane (W/m2)
PVGIS (c) European Communities, 2001-2012
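One way such a loop could be sketched is shown below. It assumes, as stated in the question, that the data starts on line 9 and that every data line begins with a time of the form hh:mm followed by numeric columns. The folder name 'data' and all variable names are hypothetical.

```matlab
folder = 'data';                          % hypothetical folder holding the .txt files
files = dir(fullfile(folder, '*.txt'));
G = cell(numel(files), 1);                % one vector of G values per file

for k = 1:numel(files)
    fid = fopen(fullfile(folder, files(k).name), 'r');
    for skip = 1:8                        % header lines: data starts on line 9
        fgetl(fid);
    end
    vals = [];
    txt = fgetl(fid);
    while ischar(txt)                     % fgetl returns -1 at end of file
        row = sscanf(txt, '%d:%d %f');    % hours, minutes, then the G column
        if numel(row) < 3                 % blank line or footer text: stop reading
            break
        end
        vals(end+1, 1) = row(3);          % 2nd column of the file (G)
        txt = fgetl(fid);
    end
    fclose(fid);
    G{k} = vals;                          % use G{k} in the calculations for file k
end
```

The loop stops at the first line that does not parse as a time followed by a number, so it does not need to know the exact footer text.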

Saving text matrix in a directory: MATLAB

I have a matrix, say
A = [11084 2009 572 277 1095 685 636 365 545 697 518 490 747 1648;
11084 2010 1000 533 340 212 635 254 399 759 110 248 490 214;
11084 2011 587 410 481 146 99 499 547 118 706 20 174 526;
12813 2009 216 486 1443 207 730 369 518 625 816 767 382 1352;
12813 2010 673 544 517 204 704 504 219 1033 633 168 473 272;
12813 2011 348 238 458 107 90 394 1014 196 1109 34 365 250];
Column 1 indicates the station ID. I want to save the output in a separate directory, in a file named after the station ID. In this case, a text file named 11084.txt will be created, containing the following data:
2009 572;2009 277;2009 1095;2009 685;2009 636;2009 365;2009 545;2009 697;2009 518;2009 490;2009 747;2009 1648;2010 1000;2010 533;2010 340;2010 212;2010 635;2010 254;2010 399;2010 759;2010 110;2010 248;2010 490;2010 214;2011 587;2011 410;2011 481;2011 146;2011 99;2011 499;2011 547;2011 118;2011 706;2011 20;2011 174;2011 526;
Similarly, the next file, 12813.txt, will contain:
2009 216;2009 486;2009 1443;2009 207;2009 730;2009 369;2009 518;2009 625;2009 816;2009 767;2009 382;2009 1352;2010 673;2010 544;2010 517;2010 204;2010 704;2010 504;2010 219;2010 1033;2010 633;2010 168;2010 473;2010 272;2011 348;2011 238;2011 458;2011 107;2011 90;2011 394;2011 1014;2011 196;2011 1109;2011 34;2011 365;2011 250;
Please let me know how to do so. Thanks,
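A straightforward solution is:
d = unique(A(:,1)); % distinct station IDs
for i = 1:length(d)
    % one output file per station, named after its ID
    fid = fopen([num2str(d(i)) '.txt'],'w');
    % rows of A that belong to this station
    aux = find(A(:,1)==d(i))';
    for j = aux
        % columns 3 onwards hold the values; column 2 holds the year
        for k = 3:size(A,2)
            fprintf(fid,'%d %d;', A(j,2), A(j,k));
        end
    end
    fclose(fid);
end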
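The loop above writes into the current folder. A variant that writes into a separate directory, as the question asks, might look like the sketch below. The directory name 'station_output' is a hypothetical choice, and the inner loops are replaced by a single fprintf call that cycles through year/value pairs.

```matlab
A = [11084 2009 572 277 1095 685 636 365 545 697 518 490 747 1648;
     11084 2010 1000 533 340 212 635 254 399 759 110 248 490 214;
     11084 2011 587 410 481 146 99 499 547 118 706 20 174 526;
     12813 2009 216 486 1443 207 730 369 518 625 816 767 382 1352;
     12813 2010 673 544 517 204 704 504 219 1033 633 168 473 272;
     12813 2011 348 238 458 107 90 394 1014 196 1109 34 365 250];

outdir = 'station_output';               % hypothetical output directory
if ~exist(outdir, 'dir')
    mkdir(outdir);
end

ids = unique(A(:,1));
for i = 1:numel(ids)
    rows = A(A(:,1) == ids(i), :);       % all rows of this station
    vals = rows(:, 3:end);               % one row of values per year
    years = repmat(rows(:,2), 1, size(vals, 2));
    % 2-by-N array of [year; value] pairs, ordered year by year
    pairs = [reshape(years.', 1, []); reshape(vals.', 1, [])];
    fid = fopen(fullfile(outdir, sprintf('%d.txt', ids(i))), 'w');
    fprintf(fid, '%d %d;', pairs);       % the format is reused for every pair
    fclose(fid);
end
```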

find peaks and frequency from spectrum

Suppose we have the following plot of a power spectrum.
My goal is to:
1. detect the peak values of this power spectrum
2. detect the frequencies at which they occur
I obtained the plot with the following commands:
[Pxx,f]=periodogram(B,[],[],100);
plot(f,Pxx);
where B is the input signal and 100 is the sampling frequency. I tried to use the findpeaks command in MATLAB like this:
[pxx_peaks,location]=findpeaks(Pxx);
and then looked up
f(location)
but the result does not seem to match the actual frequencies. How can I find the frequencies of the given peaks? Thanks a lot.
An example follows.
The peaks are:
0.417543614272817
0.389922187581014
0.381603315802419
0.601652859233616
0.396925294300794
0.369200511917405
0.477076452316346
0.792431584110476
0.612598437936600
0.564751537228850
0.940538666131177
0.600215481734847
0.985881201195063
0.950077461673118
1.24336273213410
1.84522775800633
1.73186592736729
3.46075557122590
4.93259798197976
8.47095716918618
25.2287636895831
1422.19492782494
60.8238733887811
11.3141744831953
8.65598040591953
3.92785491164888
2.51086405960291
2.27469230188760
1.90435488292485
1.25933693517960
1.52851575480462
0.933543409438383
1.21157308704582
0.821400666720535
1.28706199713640
1.19575886464231
0.736744959694641
0.986899895695809
0.758792061180657
0.542782326712391
0.704787750202814
0.998785634315287
0.522384453408780
0.602294251721841
0.525224294805813
0.624034405807298
0.498659616687732
0.656212420735658
0.866037361133916
0.624405636807668
0.435350646037440
1.22960953802959
0.891793067878849
1.06358076764171
1.34921178081181
1.02878577330537
1.93594290806582
1.14486512201656
2.01004088022982
2.24124811810385
2.15636037453584
4.81721425534142
4.87939454466131
10.5783535493504
27.0572221453758
1490.03057130613
62.3527480644562
13.6074209231800
9.85304975304259
16.3163128995995
74.1532966917877
1510.37374385290
27.7825124315786
8.66382951478539
7.72195587189507
6.06702456628919
3.35353608882459
4.90341095941571
5.07665716731356
4.47635486149688
9.79494608790444
22.9153086380666
1119.97978883924
57.0699524267842
15.2791339483160
5.36617545130941
3.90480316969632
2.58828964019220
1.16385064506181
1.55998411282069
1.14803074836796
0.468260146832541
0.467641715366303
0.698088976126660
0.504713663418641
0.375910057283262
0.331115262928959
0.204555648718379
0.182936666944843
0.293075999812128
0.272993318570981
0.280495615619829
0.148399626645134
location :
3
6
8
11
13
16
18
20
22
25
27
30
32
34
37
39
42
44
46
49
51
55
58
61
63
65
68
70
73
75
77
79
82
85
87
89
91
94
96
99
101
103
106
108
111
113
115
118
120
123
125
127
129
132
134
137
139
141
144
146
148
151
153
156
158
162
165
167
171
174
176
179
183
185
188
190
193
195
197
199
202
204
208
211
213
216
218
220
222
225
227
230
232
234
237
239
241
243
245
248
250
252
254
and f(location)
f(location)
ans =
0.3906
0.9766
1.3672
1.9531
2.3438
2.9297
3.3203
3.7109
4.1016
4.6875
5.0781
5.6641
6.0547
6.4453
7.0313
7.4219
8.0078
8.3984
8.7891
9.3750
9.7656
10.5469
11.1328
11.7188
12.1094
12.5000
13.0859
13.4766
14.0625
14.4531
14.8438
15.2344
15.8203
16.4063
16.7969
17.1875
17.5781
18.1641
18.5547
19.1406
19.5313
19.9219
20.5078
20.8984
21.4844
21.8750
22.2656
22.8516
23.2422
23.8281
24.2188
24.6094
25.0000
25.5859
25.9766
26.5625
26.9531
27.3438
27.9297
28.3203
28.7109
29.2969
29.6875
30.2734
30.6641
31.4453
32.0313
32.4219
33.2031
33.7891
34.1797
34.7656
35.5469
35.9375
36.5234
36.9141
37.5000
37.8906
38.2813
38.6719
39.2578
39.6484
40.4297
41.0156
41.4063
41.9922
42.3828
42.7734
43.1641
43.7500
44.1406
44.7266
45.1172
45.5078
46.0938
46.4844
46.8750
47.2656
47.6563
48.2422
48.6328
49.0234
49.4141
It seems to me that findpeaks does exactly what it should. If used with no further parameters, it
"finds local peaks in the data vector X. A local peak
is defined as a data sample which is either larger than the two
neighboring samples or is equal to Inf."
(http://www.mathworks.de/de/help/signal/ref/findpeaks.html)
Since the only check this function performs is whether a point is higher than its neighbours, you get a lot of peaks in a noisy signal.
You might want to limit the peaks that findpeaks returns. For example, findpeaks(Pxx,'NPEAKS',n) returns at most n peaks (combine it with 'SORTSTR','descend' to get the n biggest ones), and findpeaks(X,'THRESHOLD',t) only returns the peaks that exceed their neighbouring samples by at least t.
The best way might be findpeaks(X,'MINPEAKHEIGHT',m), which looks for all peaks that are higher than m, with m determined as a percentile of your input Pxx. The frequencies of the peaks are then f(location), where location is the second output of findpeaks.
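As a rough illustration of the height-based approach, the sketch below runs it on a synthetic two-tone signal that stands in for B. The signal, the 10% cutoff and the variable names are assumptions made for the example, and periodogram and findpeaks require the Signal Processing Toolbox (or the Octave signal package).

```matlab
Fs = 100;                                  % sampling frequency, as in the question
t = (0:999).' / Fs;
% Synthetic stand-in for the input signal: tones at 10 Hz and 30 Hz plus noise
B = sin(2*pi*10*t) + 0.5*sin(2*pi*30*t) + 0.1*randn(size(t));

[Pxx, f] = periodogram(B, [], [], Fs);

% Keep only peaks taller than 10% of the largest value (arbitrary cutoff;
% a percentile of Pxx could be used instead)
m = 0.1 * max(Pxx);
[pks, locs] = findpeaks(Pxx, 'MinPeakHeight', m);

peakFreqs = f(locs);                       % frequencies of the retained peaks
disp([peakFreqs pks]);
```

Indexing f with the locations returned by findpeaks is the same lookup the question already uses. The difference is only that the height cutoff discards the many small local maxima caused by noise.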

Multidimensional scaling matrix error

I'm trying to use multidimensional scaling in Matlab. The goal is to convert a similarity matrix to a scatter plot (in order to use k-means).
I've got the following test set:
London Stockholm Lisboa Madrid Paris Amsterdam Berlin Prague Rome Dublin
0 569 667 530 141 140 357 396 570 190
569 0 1212 1043 617 446 325 423 787 648
667 1212 0 201 596 768 923 882 714 714
530 1043 201 0 431 608 740 690 516 622
141 617 596 431 0 177 340 337 436 320
140 446 768 608 177 0 218 272 519 302
357 325 923 740 340 218 0 114 472 514
396 423 882 690 337 272 114 0 364 573
569 787 714 516 436 519 472 364 0 755
190 648 714 622 320 302 514 573 755 0
I got this dataset from the book Modern Multidimensional Scaling (Borg & Groenen, 2005). I tested it in SPSS using the PROXSCAL MDS method and got the same result as stated in the book.
But I need to use MDS in Matlab in order to speed up the process. The tutorial at http://www.mathworks.nl/help/stats/multidimensional-scaling.html#briu08r-4 appears to do the same thing as what I'm doing above. When I replace its data set with the one displayed above and run the code, I get the following error: "Not a valid dissimilarity or distance matrix.".
I'm not sure what I'm doing wrong, or whether classical MDS is the right choice. I would also like to be able to request the result in three dimensions (this will be needed at a later stage).
Your matrix is not symmetric: compare the entries at indices (9,1) and (1,9), which are 569 and 570. To quickly find asymmetric entries, use [x,y]=find(~(D'==D))
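A minimal sketch of the check and a possible follow-up is given below. Which of the two conflicting values is the correct one is not known here, so the sketch simply mirrors the upper triangle as an assumption; the right value should be checked against the book. cmdscale requires the Statistics Toolbox, and keeping the first three columns of its output is one way to get a three-dimensional configuration.

```matlab
D = [  0  569  667  530 141 140 357 396 570 190;
     569    0 1212 1043 617 446 325 423 787 648;
     667 1212    0  201 596 768 923 882 714 714;
     530 1043  201    0 431 608 740 690 516 622;
     141  617  596  431   0 177 340 337 436 320;
     140  446  768  608 177   0 218 272 519 302;
     357  325  923  740 340 218   0 114 472 514;
     396  423  882  690 337 272 114   0 364 573;
     569  787  714  516 436 519 472 364   0 755;
     190  648  714  622 320 302 514 573 755   0];

% Locate entries that differ from their mirror image
[r, c] = find(D ~= D.');
disp([r c]);

% Assumption: trust the upper triangle and mirror it into the lower one
Dsym = triu(D) + triu(D, 1).';

% Classical MDS; keep at most the first three coordinates
Y = cmdscale(Dsym);
Y3 = Y(:, 1:min(3, size(Y, 2)));
```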