Performing Kernel Density Estimations in MATLAB - matlab

I have been using MATLAB to perform Kernel Density Estimations (KDE) on UTM data (X and Y coordinates). I ran into a problem that I do not seem to be understanding.
I perform the KDEs with a sample of 45 points. Everything works fine and I produce the graphs with contours.
[bandwidth,density,X,Y]=kde2d(data)
The function kde2d is code by Zdravko Botev. I obtained it from his file exchange on MathWorks. The variable 'data' is a 45x2 array of my data. The first column holds the X coordinates and the second the Y.
The problem comes when I try to do the same line of code on a subset of those 45 points. I get a recurring error:
Error using fzero (line 274)
The function values at the interval endpoints must differ in sign.
Error in kde2d (line 101)
t_star=fzero(#(t)(t-evolve(t)),[0,0.1]);
I get the same error for a bunch of those subsets on a bunch of different sets of 45 points.
The complete set has these 45 values:
1594436.281 572258.1272
1594418.48 572357.5859
1594471.362 572385.5186
1594516.726 572266.8206
1594415.313 572369.2754
1594519.701 572272.7153
1594415.377 572363.4139
1594468.365 572381.5779
1594518.139 572276.6059
1594425.496 572271.6874
1594524.259 572272.7651
1594502.555 572172.8749
1594516.747 572264.867
1594485.314 572360.2689
1594476.027 572375.7997
1594556.087 572419.6609
1594522.718 572274.7021
1594472.775 572395.3039
1594554.568 572419.6443
1594527.255 572276.7054
1594474.315 572393.3669
1594522.697 572276.6557
1594471.319 572389.4262
1594460.854 572373.6799
1594546.022 572228.0609
1594460.79 572379.5414
1594468.323 572385.4855
1594466.953 572371.7926
1594519.722 572270.7614
1594396.76 572398.3826
1594468.131 572403.0693
1594418.288 572375.1697
1594396.377 572433.5499
1594448.287 572271.9361
1594510.541 572276.523
1594424.466 572226.7345
1594413.773 572371.2124
1594511.848 572296.0774
1594513.367 572296.094
1594424.488 572224.7805
1594468.152 572401.1153
1594421.37 572371.2953
1594446.768 572271.9195
1594468.152 572401.1153
1594448.799 572225.0457
One of the subsets I am trying to use is this:
1594436.281 572258.1272
1594418.48 572357.5859
1594471.362 572385.5186
1594516.726 572266.8206
1594415.313 572369.2754
1594519.701 572272.7153
1594415.377 572363.4139
1594468.365 572381.5779
1594518.139 572276.6059
1594425.496 572271.6874
I am not sure if I should include any of Botev's code. I am hoping that the error message can be explained on its own. If not I can provide more. Thank you very much.

Related

Unclear documentation for ANN parameters

I'm dealing with some problem about reproducing results of ANN. Namely in the parameters of ANN there are 2 that I cannot address:
net. Inputs{i}.range
net.output.range
The documentation is very lapidary and cannot help me to understand how it works. Both seem to have massive impact on the output. Let's consider this MWE:
net=feedforwardnet(10);
net.inputs{1}.size=3;
%net.inputs{1}.range=[0 100;0 100;0 100];
%net.output.range=[1 200;];
net.layers{2}.size=1;
L1=[-1.1014, -2.1138, -2.6975;
-2.3545, 0.7693, 1.7621;
-1.1258, -1.4171, -3.1113;
-0.7845, -3.7105, 0.1605;
0.3993, 0.7042, 3.5076;
0.283, -3.914, -1.3428;
-2.0566, -3.4762, 1.3239;
-1.0626, 0.3662, 2.9169;
0.1367, 2.5801, 2.5867;
0.7155, 2.6237, 2.5376;];
B1=[3.5997, 3.1386, 2.7002, 1.8243, -1.9267, -1.6754, 0.8252, 1.0865, -0.0005, 0.6126];
L2=[0.5005, -1.0932, 0.34, -1.5099, 0.5896, 0.5881, 0.4769, 0.6728, -0.9407, -1.0296];
B2=0.1567;
net.IW{1}=L1;
net.Lw{2,1}=L2;
net.b{1}=B1';
net.b{2}=B2;
input=[40; 30; 20];
output=net(input)
If you uncomment line 3 and 4 the result rises from 0.1464 to 119.1379. I'm trying to reproduce this aspect of Matlab ANN in another environment, but the documentation is too short and does not explain anything.
What do these two parameters exactly do? I mean, what exact function is applied on the input and output data?

How to plot ROC if result is in text file in matlab

My program results in text (notepad) file, I want to plot FAR Vs FRR and EER Vs Threshold. Following is the result.
FAR, FRR, Accuracy, EER, Threshold
21.3248 46.6667 78.0417 21.9583 0.5467
23.5897 41.6667 75.9583 24.0417 0.5007
25.7265 40.8333 73.8958 26.1042 0.5168
28.8889 50.8333 70.5625 29.4375 0.5591
26.9658 43.3333 72.6250 27.3750 0.3973
17.0085 50.8333 82.1458 17.8542 0.4310
22.9274 43.3333 76.5625 23.4375 0.3339
16.0470 46.6667 83.1875 16.8125 0.4013
16.4530 43.3333 82.8750 17.1250 0.5091
18.8462 41.6667 80.5833 19.4167 0.5055
First, you need to load your data using:
load data.txt
result = data;
After that, you plot your ROC by referring to the following code:
plotroc(targets,outputs)
plotroc(targets1,outputs2,'name1',...)
you may see here: https://www.mathworks.com/help/deeplearning/ref/plotroc.html

Issues fitting an exponential function

I'm having some serious issues fitting an exponential function (Beer-Lambert law) to my data. The optimization toolset function that I'm using produces terrible fits:
function [ Coefficients ] = fitting_new( Modified_Spectrum_Data,trajectory )
x_axis = trajectory;
fun = #(x,x_axis) (x(1)*exp((-x(2))*x_axis));
start = [Modified_Spectrum_Data(1) 0.05];
nlm = nlinfit(x_axis,Modified_Spectrum_Data,fun,start,opts);
Coefficients = nlm;
end
Data:
Modified_Spectrum_Data = [1.11111111111111, 1.08784976353957, 1.06352170731165, 1.04099672033640, 1.02649723285838, 1.00423806910703, 0.994116452961827, 0.975928861361604, 0.963081773802984, 0.953191520906905, 0.940636278551651, 0.930360007604054, 0.922259178548511, 0.916659345499171, 0.909149956799775, 0.901241601559703, 0.895375741449218, 0.893308346234150, 0.887985459843162, 0.884657500398024, 0.883852990694089, 0.877158499678129, 0.874817832833850, 0.875428444059047, 0.873170360623947, 0.871461252768665, 0.867913776631497, 0.866459074988087, 0.863819528471106, 0.863228815347816 ,0.864369045426273 ,0.860602502500599, 0.862653463581049, 0.861169231463016, 0.858658616425390, 0.864588421841755, 0.858668693409622, 0.857993365648639]
trajectory = [0.0043, 0.9996, 2.0007, 2.9994, 3.9996, 4.9994, 5.9981, 6.9978, 7.9997, 8.9992, 10.0007, 10.9993, 11.9994, 12.9992, 14.0001, 14.9968, 15.9972, 16.9996, 17.9996, 18.999, 19.9992, 20.9996, 21.9994, 23.0003, 23.9992, 24.999, 25.9987, 26.9986, 27.999, 28.9991, 29.999, 30.9987, 31.9976, 32.9979, 33.9983, 34.9988, 35.999, 36.9991]
I've tried using multiple different fitting functions and messing around with the options, but they don't seem to make too big of a difference. Additionally, I've tried changing the initial guess, but again that doesn't really make a difference.
Excel seems to be able to fit the data perfectly fine, but I have 900 rows of data I want to fit so doing it in Excel is not possible.
Any help would be greatly appreciated, thank you.
You'll want to use the cftool. Your data looks to follow a power law. Then choose 'Modified Spectrum Data' as your x axis and 'Trajectory' as your y. Select 'Power' from the drop down menu towards the top of the GUI.
Modified_Spectrum_Data = [1.11111111111111, 1.08784976353957, 1.06352170731165, 1.04099672033640, 1.02649723285838, 1.00423806910703, 0.994116452961827, 0.975928861361604, 0.963081773802984, 0.953191520906905, 0.940636278551651, 0.930360007604054, 0.922259178548511, 0.916659345499171, 0.909149956799775, 0.901241601559703, 0.895375741449218, 0.893308346234150, 0.887985459843162, 0.884657500398024, 0.883852990694089, 0.877158499678129, 0.874817832833850, 0.875428444059047, 0.873170360623947, 0.871461252768665, 0.867913776631497, 0.866459074988087, 0.863819528471106, 0.863228815347816 ,0.864369045426273 ,0.860602502500599, 0.862653463581049, 0.861169231463016, 0.858658616425390, 0.864588421841755, 0.858668693409622, 0.857993365648639]
trajectory = [0.0043, 0.9996, 2.0007, 2.9994, 3.9996, 4.9994, 5.9981, 6.9978, 7.9997, 8.9992, 10.0007, 10.9993, 11.9994, 12.9992, 14.0001, 14.9968, 15.9972, 16.9996, 17.9996, 18.999, 19.9992, 20.9996, 21.9994, 23.0003, 23.9992, 24.999, 25.9987, 26.9986, 27.999, 28.9991, 29.999, 30.9987, 31.9976, 32.9979, 33.9983, 34.9988, 35.999, 36.9991]
cftool
Screenshot:
For more information on the curve fitting (cftool), see: https://www.mathworks.com/help/curvefit/curvefitting-app.html

Matlab: efficienting portion of code, random start

I have the following code that generates a matrix of 15 blocks that will then be used in a Montecarlo approach as multiple starting points. How can I get the same exact result in a smarter way?
assume that J=15*100 are the total simulation and paramNum the number of parameters
[10^-10*ones(paramNum,round(J/15)) 10^-9*ones(paramNum,round(J/15)) 10^-8*ones(paramNum,round(J/15)) 10^-7*ones(paramNum,round(J/15)) 10^-6*ones(paramNum,round(J/15)) 10^-5*ones(paramNum,round(J/15)) rand*10^-5*ones(paramNum,round(J/15)) 10^-4*ones(paramNum,round(J/15)) rand*10^-4*ones(paramNum,round(J/15)) 10^-3*ones(paramNum,round(J/15)) 10^-2*ones(paramNum,round(J/15)) 10^-1*ones(paramNum,round(J/15)) 10^-abs(randn/2)*ones(paramNum,round(J/15))];
you could do
v = 10.^[-10:-5 rand*10^-5 -4:-1 10^-abs(randn/2)];
repmat(repelem(v, 1, round(J/15)), paramNum) .* ...
repmat(ones(paramNum,round(J/15)), numel(v))
Or mimic the repmat/repelem functionality with a for loop. The first is shorter, the later is more understandable.
By the way... it's less than 15 blocks...

Function equivalent to SUM() for multiplication in SQL Reporting

I'm looking for a function or solution to the following:
For the chart in SQL Reporting i need to multiply values from a Column A. For summation i would use =SUM(COLUMN_A) for the chart. But what can i use for multiplication - i was not able to find a solution so far?
Currently i am calculating the value of the stacked column as following:
=ROUND(SUM(Fields!Value_Is.Value)/SUM(Fields!StartValue.Value),3)
Instead of SUM i need something to multiply the values.
Something like that:
=ROUND(MULTIPLY(Fields!Value_Is.Value)/MULTIPLY(Fields!StartValue.Value),3)
EDIT #1
Okay tried to get this thing running.
The expression for the chart looks like this:
=Exp(Sum(Log(IIf(Fields!Menge_Ist.Value = 0, 10^-306, Fields!Menge_Ist.Value)))) / Exp(Sum(Log(IIf(Fields!Startmenge.Value = 0, 10^-306, Fields!Startmenge.Value))))
If i calculate my 'needs' manually i have to get the following result:
In my SQL Report i get the following result:
To make it easier, these are the raw values:
and you have the possibility to group the chart by CW, CQ or CY
(The values from the first pictures are aggregated Sum values from the raw values by FertStufe)
EDIT #2
Tried your expression, which results in this:
Just to make it clear:
The values in the column
=Value_IS / Start_Value
in the first picture are multiplied against each other
0,9947 x 1,0000 x 0,59401 = 0,58573
Diffusion Calenderweek 44 Sums
Startvalue: 1900,00 Value Is: 1890,00 == yield:0,99474
Waffer unbestrahlt Calenderweek 44 Sums
Startvalue: 620,00 Value Is: 620,00 == yield 1,0000
Pellet Calenderweek 44 Sums
Startvalue: 271,00 Value Is: 160,00 == yield 0,59041
yield Diffusion x yield Wafer x yield Pellet = needed Value in chart = 0,58730
EDIT #3
The raw values look like this:
The chart ist grouped - like in the image - on these fields
CY (Calendar year), CM (Calendar month), CW (Calendar week)
You can download the data as xls here:
https://www.dropbox.com/s/g0yrzo3330adgem/2013-01-17_data.xls
The expression i use (copy / past from the edit window)
=Exp(Sum(Log(Fields!Menge_Ist.Value / Fields!Startmenge.Value)))
I've exported the whole report result to excel, you can get it here:
https://www.dropbox.com/s/uogdh9ac2onuqh6/2013-01-17_report.xls
it's actually a workaround. But I am pretty sure is the only solution for this infamous problem :D
This is how I did:
Exp(∑(Log(X))), so what you should do is:
Exp(Sum(Log(Fields!YourField.Value)))
Who said math was worth nothing? =D
EDIT:
Corrected the formula.
By the way, it's tested.
Addressing Ian's concern:
Exp(Sum(Log(IIf(Fields!YourField.Value = 0, 10^-306, Fields!YourField.Value))))
The idea is change 0 with a very small number. Just an idea.
EDIT:
Based on your updated question this is what you should do:
Exp(Sum(Log(Fields!Value_IS.Value / Fields!Start_Value.Value)))
I just tested the above code and got the result you hoped for.