Stopping criteria for fminsearch in Matlab - matlab

I am using fminsearch to fit parameters for a system of DEs to observed data. I am not expecting to get a great fit.
fminsearch pretty quickly finds what appears to be an acceptable min for the objective function, but then does not stop. It's running for a really long time, and I cannot figure out why.
I am using the options
options = optimset('Display','iter','TolFun',1e-4,'TolX',1e-4,'MaxFunEvals',1000);
which I understood to mean that when the value of the objective function drops to below 1e-4 that would be considered sufficient. Alternatively when they could no longer change the parameters whatever is the best would be returned.
The output is
Iteration Func-count min f(x) Procedure
0 1 8.13911e+10
1 8 7.2565e+10 initial simplex
2 9 7.2565e+10 reflect
3 10 7.2565e+10 reflect
4 11 7.2565e+10 reflect
5 12 7.2565e+10 reflect
6 13 7.2565e+10 reflect
7 15 6.85149e+10 expand
8 16 6.85149e+10 reflect
9 17 6.85149e+10 reflect
10 19 6.20681e+10 expand
11 20 6.20681e+10 reflect
12 22 5.55199e+10 expand
13 23 5.55199e+10 reflect
14 25 4.86494e+10 expand
15 26 4.86494e+10 reflect
16 27 4.86494e+10 reflect
17 29 3.65616e+10 expand
18 30 3.65616e+10 reflect
19 31 3.65616e+10 reflect
20 33 2.82946e+10 expand
21 34 2.82946e+10 reflect
22 36 2.02985e+10 expand
23 37 2.02985e+10 reflect
24 39 1.20011e+10 expand
25 40 1.20011e+10 reflect
26 41 1.20011e+10 reflect
27 43 5.61651e+09 expand
28 44 5.61651e+09 reflect
29 45 5.61651e+09 reflect
30 47 2.1041e+09 expand
31 48 2.1041e+09 reflect
32 49 2.1041e+09 reflect
33 51 5.15751e+08 expand
34 52 5.15751e+08 reflect
35 53 5.15751e+08 reflect
36 55 7.99868e-05 expand
37 56 7.99868e-05 reflect
38 58 7.99835e-05 reflect
39 59 7.99835e-05 reflect
I have previously let this run for a lot longer and it's stuck with the same min f(x) for at least the next 30 print outs.
How do I set the options correctly so that when it finds a solution within an acceptable value for the objective function it stops?

Matlab requires that both TolX AND TolFun be satisfied before terminating ("Unlike other solvers, fminsearch stops when it satisfies both TolFun and TolX." See: https://www.mathworks.com/help/matlab/ref/fminsearch.html). You should check what the "x" value (your solution) is doing. I suspect that is changing more than your tolerance specification for each step. (i.e. the value of x is changing more than TolX between iterations but f(x) is not changing by more than TolFun).

Related

Getting started with LibSVM for a particular case

I have read quite a lot about LibSVM library, but I would like to ask you for some advices in my particular case. The problem is that I have some 3D medical images (DCE-MRI) of a stomach. My goal is to perform a segmentation of a kidney, and find its three parts. Therefore, I need to train a classifier - I'm going to use SVM and neural network
Feature vectors:
What is available is the pixel (voxel) brightness value (I guess the value range is [0; 511]). In total, there are 71 frames, each taken every second. So the crucial feature of every voxel is how the voxel brightness/intensity is changing during the examination time. In my case, every part of a kidney has a different chart (see an example below), so the way how the voxels brightness is changing over the time will be used by the classifier.
Training sets:
Every training set is a vector of intensity value of one voxel (74 numbers). An example is presented below:
[22 29 21 7 19 12 23 25 33 28 25 5 21 18 27 21 11 11 26 12 12 31 15 15 12 29 17 34 30 11 12 24 35 28 27 26 29 22 15 23 24 14 14 37 241 313 350 349 382 402 333 344 332 366 339 383 383 379 394 398 402 357 346 379 365 376 366 365 360 363 376 383 389 385]
Summary and question to you:
I have many training sets consisting of 74 values from the range [0; 511]. I have 3 groups of voxels, which have a characteristic feature - the brightness is changing in the similar way. What I want to obtain is a classificator, which after getting one voxel vector with 74 numbers, will assess if the voxel belongs to one of these 3 groups, or to none of them.
Question: how to start with LibSVM, any advices? From what I know now is that I should transform input values to be from the range [0; 1] or [-1; 1]. I have many training sets prepared belonging to one of these 3 groups. I will be grateful for any advice, as I'm a newbie and I just need some tips just to start.
You can train and use you model like this:
model=svmtrain(train_label,train_feature,'-c 1 -g 0.07 -h 0');
% the parameters can be modified
[label, accuracy, probablity]=svmpredict(test_label,test_feaure,model);
train_label must be a vector,if there are more than two kinds of input(0/1),it will be an nSVM automatically. If you have 3 classes, you can label them using {1,2,3}.Its length is equal to the number of samples.
The feature is not restricted. It can be what ever you want.
However, you'd better preprocess them to make the results better. For example, you can change range[0:511] to range[0:1] or minus the mean of the feature.
Notice that the testset data should be preprocessed in the same way.
Hope this will help you!

Displaying data to a map, creating a choropleth

What I would like to do is create a choropleth map which is darker or lighter based on the number of data points in a particular area.
I have the following data:
RO-B, 9
PL-MZ, 24
SE-C, 3
DE-NI, 5
PL-DS, 14
ES-CM, 11
RO-IS, 2
DE-BY, 51
SE-Z, 18
CH-BE, 10
PL-WP, 1
ES-IB, 1
DE-BW, 21
DE-BE, 24
DE-BB, 1
IE-M, 26
ES-PV, 1
DE-SN, 6
CH-ZH, 31
ES-GA, 1
NL-GE, 2
IE-U, 1
ES-AN, 4
FR-J, 82
DE-HH, 34
PL-PD, 1
PL-LD, 6
GB-WLS, 60
GB-ENG, 8619
RO-BV, 45
CH-VD, 2
PL-SL, 1
DE-HE, 17
SE-I, 1
HU-PE, 4
PL-MA, 4
SE-AB, 3
CH-BS, 20
ES-CT, 31
DE-TH, 25
IE-C, 1
CZ-ST, 1
DE-NW, 29
NL-NH, 3
DE-RP, 9
CZ-PR, 4
IE-L, 134
HU-BU, 10
RO-CJ, 1
GB-NIR, 29
ES-MD, 33
CH-LU, 11
GB-SCT, 172
CH-GE, 3
BE-BRU, 30
BE-VLG, 25
It references the ISO3166-2 of a country and sub region, and the # corresponds to the amount of data points affiliated with that region.
I've seen this project on GitHub which seems to also use the same ISO3166-2 to reference countries.
I'm trying to figure out how I could modify their code to display my data points, such that if the number is higher the area would be darker, if the number is less it would be lighter.
It seems it should be possible, the first thing I was trying to do was modify this jsfiddle code, which seems to be very close to what I need, but I couldn't get it to work.
For instance this line:
osmeRegions.geoJSON('RU-MOW',
Seems to directly reference a ISO3166-2 code, but it's not as simple as just changing that (or maybe it is but I couldn't get that to work properly).
Does anyone know if I could possibly adapt the code from that project to create the map rendering I've described?
Or perhaps there's a different way to achieve the same goal?

How to display all x-labels on 'bar' plot?

I have the following data that I wish to plot in a bar graph in MatLab:
publications = [15 12 35 12 19 14 21 15 7 16 40 28 6 13 16 6 7 22 23 16 45];
bar(publications,0.4)
set(gca,'XTickLabel',{'G1','G2','G3','G4','G5','G6','G7','G8','G9','G10',...
'G11','G12','G14','G16','G17','G18','G19','G20','G21','G22','G23'})
However, when I execute this, I get the following plot:
Obviously the x-label is incorrect here as the first bar should have the x-label 'G1', the second should have 'G2', etc, until we get to the last bar which is supposed to have 'G23'.
If anyone knows how I can fix this, I would really, really appreciate it!
Add the following line:
set(gca,'XTick',1:numel(publications))
before you set the labels.
Now it depends how big your resulting plot is, because the labels are a little packed.
You may adjust fontsize or Orientation or the gaps between the bars.
Probably the publication names are a little longer so a 90° rotation is the best and you may find this answer or this link helpful.
Another suggestion would be to use barh and rotate after print:
publications = [15 12 35 12 19 14 21 15 7 16 40 28 6 13 16 6 7 22 23 16 45];
bh = barh(publications,0.4)
set(gca','XAxisLocation','top')
set(gca,'YTick',1:numel(publications))
set(gca,'YTickLabel',{'G1','G2','G3','G4','G5','G6','G7','G8','G9','G10',...
'G11','G12','G14','G16','G17','G18','G19','G20','G21','G22','G23'})

Stata longwise average

I'm using Stata and trying to compute conditional means based on time/date. For each store I want to calculate mean (inventory) per year. If there are missing year gaps, then I want to take the mean from the closest two dates' inventory values.
I've used (below) to get overall means per store, but I need more granularity.
egen mean_inv = mean(inventory), by (store)
I've also tried this loop with similar results:
by id, sort: gen v1'=_n'
forvalues x = 1/'=n'{
by store: sum inventory if v1==`x'
replace mean_inv= r(mean) if v1==`x'
}
Visually, I want mean inventory per store: (store id is not sequential)
5/1/2003 2/3/2006 8/9/2006 3/5/2007 6/9/2007 2/1/2008
13 18 12 15 24 11
[mean1] [mean2] [mean3] [mean4] [mean5]
store date inventory
1 16750 17
1 18234 16
1 15844 13
1 17111 14
1 17870 13
1 16929 13.5
1 17503 13
4 15987 18
4 15896 16
4 18211 16
4 17154 18
4 17931 24
4 16776 23
12 16426 26
12 17681 17
12 16386 17
12 16603 18
12 17034 16
12 17205 16
42 15798 18
42 16022 18
42 17496 16
42 17870 18
42 16204 18
42 16778 14
33 18053 23
33 16086 13
33 16450 21
33 17374 19
33 16814 19
33 15834 16
33 16167 16
56 17686 16
56 17623 18
56 17231 20
56 15978 16
56 16811 15
56 17861 20
It is hard to relate your code to the word description of your problem.
Your egen call calculates means by store, not year.
Your longer fragment does not make complete sense given lack of definitions and at least one typo.
Note that your variable v1 contains identifiers that run 1 up within groups of store, and does not distinguish different values of store, as you (seem to) imply. It strains credibility that it produces results anywhere near those by the egen call.
n is not defined and the code evaluating it is presumably intended to be
`=n'
If you calculate
by store: sum inventory if v1 == `x'
several means will be calculated in turn but only the last to be calculated will be accessible as r(mean).
The sample data are unrelated to the problem. There is no year variable and even if the dates are Stata daily dates, they are all dates within 1960.
Setting all that aside, suppose you have variables store, inventory and year. You can try
collapse inventory, by(store year)
fillin store year
ipolate inventory year, gen(inventory2) by(store)
The collapse produces a reduced dataset of means. The ipolate interpolates across gaps, as you ask. fillin may not be adequate to give all the store and year combinations you want and you may need to add further years manually before the interpolation. If you want to put these results back with the original data, that's a merge.
In total, this is a pretty messy question.

cluster my data and testing of input

cluster the given data and use any retrieval algorithm to show output as shown below.
(any clustering algorithm)
Euclidean distance may be used for finding closest cases.
let a data file containing input vectors like
caseid f1 f2 f3 f4
1 30 45 9.5 1500
2 35 45 8 1600
3 38 47 10 1550
4 32 50 9.5 1800
..
..
..
t1 30 45 9.5 1500(target)
output should like
NO. f1 f2 f3 f4
t1 30 45 9.5 1500 (target)
21 35 45 10 1500(1st closest to target)
39 35 50 8 1500 (2nd closes)
56 35 42 9.5 1500 (3rd closes)
This looks like a classic nearest neighbor query to me, not like clustering.
Also I'd be careful with using Euclidean distance here. A difference of 1 in attribute f1 does not look like it is equal to a difference of 1 in attribute f4. The values seem to have a completely different magnitude.