I have this MATLAB code and I'm trying to implement the method explained in the top answer to this question: https://stats.stackexchange.com/questions/12546/software-package-to-solve-l-infinity-norm-linear-regression
Here is the code I'm using, starting with the data points:
x = [
0
0.101010101010101
0.202020202020202
0.303030303030303
0.404040404040404
0.505050505050505
0.606060606060606
0.707070707070707
0.808080808080808
0.909090909090909
];
y = [
0.052993311292562
14.923120014175920
1.974502763975613
-2.205773310050583
-0.052548781318830
2.935428041987883
0.134606520161892
0.146742215922384
-0.418386565682831
1.702041272689124
];
A1 = [x,ones(length(y),1),-ones(length(y),1)];
A2 = [-x,-ones(length(y),1),-ones(length(y),1)];
A = [A1;A2];
f = [0;0;1];
linprog(f,A,[y;-y])
The point is to find the parameters (slope and intercept) of the best-fit line by minimizing the L-infinity norm of the residuals between the line and the data points. I have made the same problem work for ordinary least squares (minimizing the L-2 norm) as well as for the L-1 fit. The lines plotted from those methods fit really nicely between the data points. But I can't seem to make this L-infinity fit work no matter what I do, so I come to you for help; any tips appreciated.
The sign of t in your inequalities is wrong. Try
A1 = [x,ones(length(y),1),-ones(length(y),1)];
A2 = [-x,-ones(length(y),1),-ones(length(y),1)];
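For reference, here is a minimal self-contained sketch of the whole formulation, assuming the Optimization Toolbox is available for linprog. The three-point data set is made up for illustration (it is not the data from the question), chosen so that the result is easy to check by hand: the best line is y = 0.5 with a maximum residual of 0.5.

```matlab
% Hypothetical data, chosen so the Chebyshev (L-infinity) line is known.
x = [0; 1; 2];
y = [0; 1; 0];
n = numel(y);

% Unknowns p = [slope; intercept; t]. Minimize t subject to
%    slope*x + intercept - t <=  y   (residual >= -t)
%   -slope*x - intercept - t <= -y   (residual <=  t)
A = [ x,  ones(n,1), -ones(n,1);
     -x, -ones(n,1), -ones(n,1)];
b = [y; -y];
f = [0; 0; 1];

p = linprog(f, A, b);

slope = p(1);
intercept = p(2);
t = p(3);

% At the optimum, t should equal the largest absolute residual.
maxResidual = max(abs(y - (slope*x + intercept)));
fprintf('slope = %g, intercept = %g, t = %g, max |residual| = %g\n', ...
        slope, intercept, t, maxResidual);
```

Note that t enters both blocks of constraints with a coefficient of -1; that is the sign the answer refers to.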
I am trying to solve equations with this code:
a = [-0.0008333 -0.025 -0.6667 -20];
length_OnePart = 7.3248;
xi = -6.4446;
yi = -16.5187;
syms x y
[sol_x,sol_y] = solve(y == poly2sym(a), ((x-xi)^2+(y-yi)^2) == length_OnePart^2,x,y,'Real',true);
sol_x = sym2poly(sol_x);
sol_y = sym2poly(sol_y);
The solutions it gives are (-23.9067,-8.7301) and (11.0333,-24.2209), which do not even satisfy the equation of the circle. How can I fix this?
If you're trying to find the intersection of the cubic and the circle, i.e., the points where y==poly2sym(a) and (x-xi)^2+(y-yi)^2==length_OnePart^2 both hold, it looks like solve may be getting confused when the circle is given as an implicit equation rather than as single-valued functions. It might also have to do with the fact that x and y are not independent unknowns: the latter depends on the former. It could also depend on the use of a numeric solver in this case. solve seems to work fine with similar inputs to yours, so you might report this behavior to MathWorks to see what they think.
In any case, here is a better, more efficient way to tackle this as a root-solving problem (as opposed to simultaneous equations):
a = [-0.0008333 -0.025 -0.6667 -20];
length_OnePart = 7.3248;
xi = -6.4446;
yi = -16.5187;
syms x real
f(x) = poly2sym(a);
sol_x = solve((x-xi)^2+(f(x)-yi)^2==length_OnePart^2,x)
sol_y = f(sol_x)
which returns:
sol_x =
0.00002145831413371390464567553686047
-13.182825373861454619370838716408
sol_y =
-20.000014306269544436430325843024
-13.646590348358951818881695033728
Note that you might get slightly more accurate results (one solution is clearly at 0,-20) if you represent your coefficients and parameters more precisely than just four decimal places, e.g., a = [-1/1200 -0.025 -2/3 -20]. In fact, solve might be able to find one or more solutions exactly if you provide exact representations.
Also, in your code, the calls to sym2poly are doing nothing other than converting back to floating-point (double can be used for this) as the inputs are not in the form of symbolic polynomial equations.
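As a purely numeric cross-check that does not need the Symbolic Math Toolbox, the same substitution can be written as a degree-6 polynomial in x and handed to roots. This is only a sketch; the 1e-9 tolerance used to decide which roots count as real is an arbitrary choice of mine.

```matlab
a = [-0.0008333 -0.025 -0.6667 -20];
length_OnePart = 7.3248;
xi = -6.4446;
yi = -16.5187;

% Coefficients of (x - xi)^2
px = [1, -2*xi, xi^2];

% Coefficients of f(x) - yi, where f is the cubic
py = a;
py(end) = py(end) - yi;

% g(x) = (x - xi)^2 + (f(x) - yi)^2 - length_OnePart^2  (degree 6)
g = conv(py, py);
g(end-2:end) = g(end-2:end) + px;
g(end) = g(end) - length_OnePart^2;

r = roots(g);
x_real = sort(real(r(abs(imag(r)) < 1e-9)));  % keep (numerically) real roots
y_real = polyval(a, x_real);

% Plug the points back into the circle equation; this should be close to zero.
residual = (x_real - xi).^2 + (y_real - yi).^2 - length_OnePart^2;
disp([x_real, y_real, residual]);
```

Checking the residual this way is also a quick test for any candidate solution, including the pair returned by the original solve call.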
I would like to calibrate an interest rate tree using the optimization tools in MATLAB. I need some guidance on doing it.
The interest rate tree looks like this:
How it works:
3.73% = 2.5%*exp(2*0.2)
96.40453 = (0.5*100 + 0.5*100)/(1+3.73%)
94.15801 = (0.5*96.40453+ 0.5*97.56098)/(1+3.00%)
The value of 2.5% is arbitrary, and the upper node is obtained by multiplying it by exp(2*volatility) (here the volatility is 20%).
I need to optimize by varying the value of the lower node.
How do I do this optimization in Matlab?
What I have tried so far:
InterestTree{1}(1,1) = 0.03;
InterestTree{3-1}(1,3-1)= 2.5/100;
InterestTree{3}(2,:) = 100;
InterestTree{3-1}(1,3-2)= (2.5*exp(2*0.2))/100;
InterestTree{3-1}(2,3-1)=(0.5*InterestTree{3}(2,3)+0.5*InterestTree{3}(2,3-1))/(1+InterestTree{3-1}(1,3-1));
j = 3-2;
InterestTree{3-1}(2,3-2)=(0.5*InterestTree{3}(2,j+1)+0.5*InterestTree{3}(2,j))/(1+InterestTree{3-1}(1,j));
InterestTree{3-2}(2,3-2)=(0.5*InterestTree{3-1}(2,j+1)+0.5*InterestTree{3-1}(2,j))/(1+InterestTree{3-2}(1,j));
But I am not sure how to go about the optimization. Any suggestions to improve the code are welcome; I need some guidance on this.
Are you expecting the tree to increase in size? Or are you just optimizing over the value of the "2.5%" parameter?
If it's the latter, there are two ways. The first is to model the tree using a closed-form expression by replacing 2.5% with x, which is possible with this tree. There are nonlinear optimization toolboxes available for Matlab, but it's been too long since I've done this to give you a more detailed answer.
The second is the approach I would try first. I'm interpreting the example you gave, so the equations I'm using may be incorrect; however, the principle of using the for loop is the same.
vol = 0.2;
maxival = 100;
val1 = zeros(1,maxival); %Preallocate
finalval = zeros(1,maxival);
for ival=1:maxival
val1(ival) = ival/1000; %Use any scaling you want. This will go from 0.1% to 10%
val2=val1(ival)*exp(2*vol);
x1 = (0.5*100+0.5*100)/(1+val2); %Based on the equation you gave
x2 = (0.5*100+0.5*100)/(1+val1(ival)); %I'm assuming this is how you calculate the bottom node
finalval(ival) = x1*0.5+x2*0.5/(1+...); %The example you gave isn't clear, so replace this with whatever it should be
end
[maxval, indmaxval] = max(finalval);
The maximum value is in maxval, and the interest that maximized this is in val1(indmaxval).
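If the goal is calibration in the usual sense, i.e. finding the lower-node rate for which the tree reproduces a given price, this can also be posed as a one-dimensional root-finding problem with fzero. The sketch below is my reading of the example in the question: it assumes a 3% root rate (as in InterestTree{1}(1,1)) and uses the question's 94.15801 as a stand-in target price. The names targetPrice and price, and the search bracket, are assumptions of mine, not something the question specifies.

```matlab
vol = 0.2;               % volatility from the question
r0 = 0.03;               % root-node rate, as in InterestTree{1}(1,1)
targetPrice = 94.15801;  % hypothetical target price to calibrate against

% Price at the root as a function of the lower-node rate r:
% discount 100 at each second-period node, average, discount again at r0.
price = @(r) (0.5*100./(1 + r*exp(2*vol)) + 0.5*100./(1 + r)) / (1 + r0);

% Find the r for which the model price matches the target.
% The bracket [0.1%, 10%] is an assumption; the price mismatch must
% change sign across it for fzero to work.
rLow = fzero(@(r) price(r) - targetPrice, [0.001 0.10]);
rUp = rLow * exp(2*vol);

fprintf('lower node = %.4f%%, upper node = %.4f%%\n', 100*rLow, 100*rUp);
```

With these inputs the search should recover a lower-node rate of about 2.5%, matching the worked numbers in the question.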
Ignore the red fitted curve for now. I'd like to fit a curve to the blue data points. I know the first part (up to y~200 in this case) is linear, then there is a different curve (a combination of two logarithmic curves, but it could also be approximated differently), and then it saturates at about 250 or 255. I tried it like this:
func = fittype('(x>=0 & x<=xTrans1).*(A*x+B)+(x>=xTrans1 & x<=xTrans2).*(C*x+D)+(x>=xTrans2).*E*255');
freg = fit(foundData(:,1), foundData(:,2), func);
plot(freg, foundData(:,1), foundData(:,2))
Okay obviously my fittype could be improved, but why is it actually THAT bad/wrong?
I tried another simpler model:
func = fittype('(x>=0 & x<=xTrans1).*(A*x+B)+(x>=xTrans1).*(C*x+D)')
freg = fit(foundData(:,1), foundData(:,2), func);
plot(freg, foundData(:,1), foundData(:,2))
At least I'd expect there to be two linear functions, and what I get is:
Or is it only the plot which is wrong because the output of the fit is:
General model:
f_fit(x) = (x>=0 & x<=xTrans1).*(A*x+B)+(x>=xTrans1).*(C*x+D)
Coefficients (with 95% confidence bounds):
A = 0.6491
B = 0.7317
C = 0.0007511
D = 143.5
xTrans1 = 0.547
Which at least yields a good xTrans1 (but I can't see it in the plot)!
EDIT
Thanks for pointing out the more clear way of programming the function to fit, I tried the following (three different linear functions with two transition points):
function y = singleRegression_ansatzfunktion(x,xtrans1,xtrans2,a,b,c,d,e,f)
y = zeros(size(x));
% 3 line equations:
for i = 1:length(x)
if x(i) < xtrans1
y(i) = a + b.* x(i);
elseif(x(i) < xtrans2)
y(i) = c + d.* x(i);
else
y(i) = e + f.* x(i);
end
end
Calling the fitter like this:
freg = fit(foundData(:,1), foundData(:,2), 'singleRegression_ansatzfunktion(x,xtrans1,xtrans2,a,b,c,d,e,f)');
plot(freg, foundData(:,1), foundData(:,2))
Resulting in:
General model:
f(x) = singleRegression_ansatzfunktion(x,xtrans1,xtrans2,a,b,c,d,e,f)
Coefficients (with 95% confidence bounds):
a = 0.7655
b = 0.7952
c = 0.1869
d = 0.4898
e = 159.2
f = 0.0005512
xtrans1 = 0.7094
xtrans2 = 0.7547
!!!!Strange!!!!
EDIT2
When NOT letting MATLAB optimize the transition points but entering them myself, as I briefly did in cftool (which should be the same as calling fit, but was quicker to try out), via the custom equation:
(x>=0 & x<=2.9e4).*(A*x+B)+(x>2.9e4 & x<=1.3e5).*(B*x+D)+(x>1.3e5).*255
It worked pretty well. I don't know why MATLAB can't do this on its own, but okay. Here is the result:
So at least I fixed it now, but I still wonder why MATLAB couldn't simply do this itself.
Have you tried the approach in the fittype documentation page ("Fit Curve Defined by a File" example) i.e. define your function to fit in a file to see if it makes a difference?
The other approach I can think of would be to split your data in two (or more) different datasets and do two separate fits for each chunk (but that assumes you know a priori where the transition point(s) is/are or can work it/them out before fitting).
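A minimal sketch of that second approach, using only polyfit from base MATLAB. The data here is synthetic and the transition point xTrans is assumed to be known; with real data you would substitute foundData(:,1), foundData(:,2) and your own transition value.

```matlab
% Synthetic two-segment data (hypothetical, stands in for foundData).
x = linspace(0, 10, 101)';
xTrans = 4;  % transition point, assumed known a priori
y = (x < xTrans).*(2*x + 1) + (x >= xTrans).*(0.5*x + 7);

% Split the data at the transition and fit each chunk separately.
left = x < xTrans;
p1 = polyfit(x(left),  y(left),  1);   % [slope intercept] of first segment
p2 = polyfit(x(~left), y(~left), 1);   % [slope intercept] of second segment

fprintf('segment 1: y = %g*x + %g\n', p1(1), p1(2));
fprintf('segment 2: y = %g*x + %g\n', p2(1), p2(2));
```

Incidentally, the fitted transition values reported in the question (0.547, 0.7094, 0.7547) all lie between 0 and 1, while the hand-entered values that worked are around 2.9e4 and 1.3e5. That suggests the optimizer never moved the transition points far from its starting guesses, so supplying the transitions (or at least good starting values) is likely to matter more than the exact form of the model.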
I've read up on fsolve and solve, and tried various methods of curve fitting/regression but I feel I need a bit of guidance here before I spend more time trying to make something work that might be the wrong approach.
I have a series of equations I am trying to fit to a data set (x) separately:
for example:
(a+b*c)*d = x
a*(1+b*c)*d = x
x = 1.9248
3.0137
4.0855
5.0097
5.7226
6.2064
6.4655
6.5108
6.3543
6.0065
c= 0.0200
0.2200
0.4200
0.6200
0.8200
1.0200
1.2200
1.4200
1.6200
1.8200
d = 1.2849
2.2245
3.6431
5.6553
8.3327
11.6542
15.4421
19.2852
22.4525
23.8003
I know c, d and x - they are observations. My unknowns are a and b, and should be constant.
I could do it manually for each x observation but there must be an automatic and far superior way or at least another approach.
Very grateful if I could receive some guidance. Thanks for the time!
Given your two example equations; let y=x./d, then
y = a+b*c
y = a+a*b*c
The first case is just a line, for which you can obtain a least squares fit (values for a and b) with polyfit(). In the second case, you can just say k=a*b (since these are both fitted anyway), then rewrite it as:
y = a+k*c
Which is exactly the same line as the first problem, except now b = k/a. In fact, b=b1/a is the solution to the second problem where b1 is the fit from the first problem. In short, to solve them both, you need one call to polyfit() and a couple of divisions.
Will that work for you?
I see two different equations to fit here. To spell out the code:
For (a+b*c)*d = x
p = polyfit(c, x./d, 1);
a = p(2);
b = p(1);
For a*(1+b*c)*d = x
p = polyfit(c, x./d, 1);
a = p(2);
b = p(1) / a;
No need for polyfit; this is just a linear least squares problem, which is best solved with MATLAB's backslash operator:
>> ab = [ones(size(c)) c] \ (x./d)
ab =
1.411437211703194e+000 % 'a'
-7.329687661579296e-001 % 'b'
Faster, cleaner, more educative :)
And, as Emmet already said, your second equation is nothing more than a different form of your first equation, the difference being that the b in your first equation is equal to a*b in your second one.
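Putting the answers together, here is a short sketch that runs both approaches on the data from the question (as printed there, to four decimal places) and converts the result to the second equation's parameters. The names ab, a2 and b2 are just labels I picked for illustration.

```matlab
x = [1.9248 3.0137 4.0855 5.0097 5.7226 6.2064 6.4655 6.5108 6.3543 6.0065]';
c = [0.02 0.22 0.42 0.62 0.82 1.02 1.22 1.42 1.62 1.82]';
d = [1.2849 2.2245 3.6431 5.6553 8.3327 11.6542 15.4421 19.2852 22.4525 23.8003]';

y = x ./ d;  % both models become straight lines in c after dividing by d

% First model: (a + b*c)*d = x  ->  y = a + b*c
ab = [ones(size(c)) c] \ y;   % least squares via backslash: ab = [a; b]
p = polyfit(c, y, 1);         % same fit via polyfit: p = [b a]

% Second model: a*(1 + b*c)*d = x  ->  y = a + (a*b)*c
a2 = ab(1);
b2 = ab(2) / ab(1);

fprintf('model 1: a = %.4f, b = %.4f\n', ab(1), ab(2));
fprintf('model 2: a = %.4f, b = %.4f\n', a2, b2);
```

The two routes should agree to rounding error, and model 1 should give roughly a = 1.41 and b = -0.73, in line with the output shown above.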
I’m currently a Physics student and for several weeks have been compiling data related to ‘Quantum Entanglement’. I’ve now got to a point where I have to plot my data (which should resemble a cos² graph - and does) to a sort of “best fit” cos² graph. The lab script says the following:
A more precise determination of the visibility V (this is basically how 'clean' the data is) follows from the best fit to the measured data using the function:
f(b) = (A/2) * [1 - V*sin((b - b_center)/P)]
Granted this probably doesn’t mean much out of context, but essentially A is the amplitude, b is an angle and P is the periodicity. Hence this is also a “wave” like the experimental data I have found.
From this I understand, as previously mentioned, I am making a “best fit” curve. However, I have been told that this isn’t possible with Excel and that the best approach is Matlab.
I know intermediate JavaScript but do not know Matlab and was hoping for some direction.
Is there a tutorial I can read for this? Is it possible for someone to go through it with me? I really have no idea what it entails, so any feedback would be greatly appreciated.
Thanks a lot!
Initial steps
I guess we should begin by getting a representation in Matlab of the function that you're trying to model. A direct translation of your formula looks like this:
function y = targetfunction(A,V,P,bc,b)
y = (A/2) * (1 - V * sin((b-bc) / P));
end
Getting hold of the data
My next step is going to be to generate some data to work with (you'll use your own data, naturally). So here's a function that generates some noisy data. Notice that I've supplied some values for the parameters.
function [y b] = generateData(npoints,noise)
A = 2;
V = 1;
P = 0.7;
bc = 0;
b = 2 * pi * rand(npoints,1);
y = targetfunction(A,V,P,bc,b) + noise * randn(npoints,1);
end
The function rand generates random points on the interval [0,1], and I multiplied those by 2*pi to get points randomly on the interval [0, 2*pi]. I then applied the target function at those points, and added a bit of noise (the function randn generates normally distributed random variables).
Fitting parameters
The most complicated function is the one that fits a model to your data. For this I use the function fminunc, which does unconstrained minimization. The routine looks like this:
function [A V P bc] = bestfit(y,b)
x0(1) = 1; %# A
x0(2) = 1; %# V
x0(3) = 0.5; %# P
x0(4) = 0; %# bc
f = @(x) norm(y - targetfunction(x(1),x(2),x(3),x(4),b));
x = fminunc(f,x0);
A = x(1);
V = x(2);
P = x(3);
bc = x(4);
end
Let's go through it line by line. To minimize a function in Matlab, it needs to take a single vector as a parameter. Therefore we have to pack our four parameters into a vector, and the first four lines set up the initial guess x0 in that form. I used values that are close to, but not the same as, the ones that I used to generate the data.
Then I define the function f that I want to minimize. It takes a single argument x, which it unpacks and feeds to the targetfunction, along with the points b in our dataset. Hopefully the results are close to y. We measure how far they are from y by subtracting them from y and applying the function norm, which squares every component, adds them up and takes the square root (i.e. it computes the Euclidean length of the residual, which is the root mean square error multiplied by the square root of the number of points).
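To make the relationship between norm and the root mean square error concrete: for a residual vector r with n entries, norm(r) equals sqrt(n) times the RMSE, so minimizing one also minimizes the other. A small check with made-up numbers:

```matlab
r = [3; -4; 12];                 % hypothetical residuals

euclid = norm(r);                % sqrt(3^2 + 4^2 + 12^2) = 13
rmse = sqrt(mean(r.^2));         % root mean square of the same residuals

% The two differ only by the constant factor sqrt(numel(r)).
fprintf('norm = %g, sqrt(n)*rmse = %g\n', euclid, sqrt(numel(r))*rmse);
```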
Then I call fminunc with our function to be minimized, and the initial guess for the parameters. This uses an internal routine to find the closest match for each of the parameters, and returns them in the vector x.
Finally, I unpack the parameters from the vector x.
Putting it all together
We now have all the components we need, so we just want one final function to tie them together. Here it is:
function master
%# Generate some data (you should read in your own data here)
[f b] = generateData(1000,1);
%# Find the best fitting parameters
[A V P bc] = bestfit(f,b);
%# Print them to the screen
fprintf('A = %f\n',A)
fprintf('V = %f\n',V)
fprintf('P = %f\n',P)
fprintf('bc = %f\n',bc)
%# Make plots of the data and the function we have fitted
plot(b,f,'.');
hold on
plot(sort(b),targetfunction(A,V,P,bc,sort(b)),'r','LineWidth',2)
end
If I run this function, I see this being printed to the screen:
>> master
Local minimum found.
Optimization completed because the size of the gradient is less than
the default value of the function tolerance.
A = 1.991727
V = 0.979819
P = 0.695265
bc = 0.067431
And the following plot appears:
That fit looks good enough to me. Let me know if you have any questions about anything I've done here.
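If the Optimization Toolbox (which provides fminunc) is not available, fminsearch from base MATLAB can be swapped in with the same objective. This is a different, derivative-free algorithm, so treat the following as a sketch rather than an equivalent of the routine above. The data is noiseless and synthetic, generated with the same parameter values as before, and the target function is written inline as an anonymous function so the snippet runs on its own.

```matlab
% Same model as targetfunction above, written as an anonymous function.
model = @(A, V, P, bc, b) (A/2) * (1 - V * sin((b - bc) / P));

% Synthetic, noiseless data (you would load your own measurements here).
b = linspace(0, 2*pi, 200)';
y = model(2, 1, 0.7, 0, b);

% Objective: Euclidean norm of the residual, parameters packed as
% p = [A V P bc].
f = @(p) norm(y - model(p(1), p(2), p(3), p(4), b));

x0 = [1 1 0.5 0];          % same starting guess as in bestfit
xfit = fminsearch(f, x0);  % derivative-free simplex search

fprintf('A = %f, V = %f, P = %f, bc = %f\n', xfit);
fprintf('objective: start = %g, end = %g\n', f(x0), f(xfit));
```

As with fminunc, the result is only a local minimum and depends on the starting guess, so it is worth comparing the fitted curve against the data before trusting the parameters.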
I am a bit surprised as you mention f(a) and your function does not contain an a, but in general, suppose you want to plot f(x) = cos(x)^2
First determine for which values of x you want to make a plot, for example
xmin = 0;
stepsize = 1/100;
xmax = 6.5;
x = xmin:stepsize:xmax;
y = cos(x).^2;
plot(x,y)
However, note that this approach works just as well in Excel; you just have to do some work to get your x values and function into the right cells.