I am a Matlab amateur so please bear with me -
I currently use Matlab to fit a complex equation to two-dimensional data. Right now I have a program which uses f = fit(xdata, ydata, function, options) to generate a fit object.
I can then use confint(f) and f.parameter etc. to get the fitted coefficients and confidence intervals, and I can use plot(f,x,y) to make a plot of the data and the fit.
From that point on the only way I know how to get the points which were plotted is to use the brush(?) tool and select all of the line, then copy the data to clipboard and paste it into Excel or some such thing. I would much rather get those points directly from MATLAB, perhaps into an array, but I have no idea how.
Can any MATLAB veteran tell me if what I want is even possible? It would be very difficult due to the complexity of my equation to plot those points myself, but I will if need be (it can take ~30 minutes to do this fit and my computer is no slouch).
Since you have not shared any code or data, I use an example from MATLAB documentation:
load franke
f = fit([x, y],z,'poly23');
plot(f,[x,y],z)
So as you can see, it first loads a dataset containing the x, y, z vectors, then fits a surface to the data using 'poly23'. In your case the vectors and the fitted function may be different, but as you said, you still end up with the fit object f.
Now we can take a look at the function f
>> f
Linear model Poly23:
f(x,y) = p00 + p10*x + p01*y + p20*x^2 + p11*x*y + p02*y^2 + p21*x^2*y
+ p12*x*y^2 + p03*y^3
Coefficients (with 95% confidence bounds):
p00 = 1.118 (0.9149, 1.321)
p10 = -0.0002941 (-0.000502, -8.623e-05)
p01 = 1.533 (0.7032, 2.364)
p20 = -1.966e-08 (-7.084e-08, 3.152e-08)
p11 = 0.0003427 (-0.0001009, 0.0007863)
p02 = -6.951 (-8.421, -5.481)
p21 = 9.563e-08 (6.276e-09, 1.85e-07)
p12 = -0.0004401 (-0.0007082, -0.0001721)
p03 = 4.999 (4.082, 5.917)
It shows you the form of the function and the coefficients. So you can use it as follows:
zz = f(x,y);
To make sure you can plot the data again:
figure;
scatter3(x,y,zz,'.k');
hold on
scatter3(x,y,z,'.');
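For a curve fit like the one in the question (one independent variable), the same idea gives the plotted points directly as arrays, without going through the figure. The following is a minimal sketch, assuming the Curve Fitting Toolbox is available; the synthetic data, the 'exp1' library model, and the names xq, yq and curve are placeholders for illustration, not the asker's actual equation.

```matlab
% Synthetic stand-in for the asker's data (assumption: column vectors).
xdata = (0:0.1:5).';
ydata = 2*exp(-0.7*xdata);

% Fit once (for a complicated model this is the slow step).
f = fit(xdata, ydata, 'exp1');

% Evaluate the fitted curve on a dense grid of query points.
xq = linspace(min(xdata), max(xdata), 200).';
yq = f(xq);                 % equivalent to feval(f, xq)

% xq and yq now hold the fitted curve as plain arrays.
curve = [xq, yq];
```

Because the fit object is only evaluated here, not refitted, this step is cheap even when the fit itself takes a long time.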
When you call f = fit(xdata, ydata, function, options), the function name determines the equation. See list of official equations.
Simply evaluate the corresponding polynomial at the points you need. So in your case, let's say function = 'poly2'; you would do the computation as follows:
% Fit your data
f = fit(xdata, ydata, 'poly2');
% Print names of the coefficients (just for verification)
coeffnames(f)
% Fetch values of the coefficients p1, p2, p3
p = coeffvalues(f)
% Compute output points from min(xdata) to max(xdata) spaced at deltaX
deltaX = 0.1;
x = min(xdata):deltaX:max(xdata);
Y = p(1)*x.^2 + p(2)*x + p(3); % This is the equation for poly2
I understand there may be more involved alternatives, such as Java code that iterates through the objects on a MATLAB figure and reads back the plotted values, but evaluating the equation directly is a quick and valid approach.
Related
I am looking to fit a parabola to the following data.
x = [-10:2:16];
y = [0.0334,0.0230,0.0145,0.0079,0.0033,0.0009,0.0006,0.0026,0.0067,0.0130,0.0213,0.0317,0.0440,0.0580];
[p,~,~] = polyfit(x,y,2);
x2 = linspace(-10,16,100);
y2 = polyval(p,x2);
y3 = 0.0003.*x2.^2 -0.0006.*x2 + 0.0011;
figure
plot(x,y,'o',x2,y2,x2,y3)
However, the fit does not match the data at all. After putting the data into Excel and fitting a 2nd-order polynomial there, I get a very nice fit: y = 0.0003x^2 - 0.0006x + 0.0011 (Excel truncating the coefficients skews the fit a bit). What is happening with polyfit with this data?
Solved.
MATLAB checks how many outputs the user is requesting. Since I requested three outputs, even though I wasn't using them, polyfit centers and scales x before fitting and returns coefficients for the transformed variable xhat = (x - mu(1))/mu(2), where mu(1) = mean(x) and mu(2) = std(x).
If I instead just did:
p = polyfit(x,y,2);
plot(x2,polyval(p,x2));
Then I would achieve the appropriate result. To recover the same answer using the three outputs:
[p2,S,mu] = polyfit(x,y,2);
xhat = (x2-mu(1))./mu(2)
y4 = polyval(p2,xhat)
plot(x2,y4)
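polyval can also undo the centering and scaling itself when mu is passed as its fourth argument, so xhat does not have to be computed by hand. A small check using the data from the question (the names p_raw, p_scaled and max_diff are just for illustration):

```matlab
x = -10:2:16;
y = [0.0334,0.0230,0.0145,0.0079,0.0033,0.0009,0.0006, ...
     0.0026,0.0067,0.0130,0.0213,0.0317,0.0440,0.0580];

% One output: coefficients in terms of x itself.
p_raw = polyfit(x, y, 2);

% Three outputs: coefficients in terms of xhat = (x - mu(1))/mu(2).
[p_scaled, S, mu] = polyfit(x, y, 2);

x2 = linspace(-10, 16, 100);
y_raw    = polyval(p_raw, x2);
y_scaled = polyval(p_scaled, x2, [], mu);   % polyval applies the same scaling

% Both evaluations describe the same parabola.
max_diff = max(abs(y_raw - y_scaled));
```

The two coefficient vectors differ, but the evaluated curves should agree to within rounding error.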
I would solve this in matlab using least squares:
x = [-10:2:16]';
Y = [0.0334,0.0230,0.0145,0.0079,0.0033,0.0009,0.0006,0.0026,0.0067,0.0130,0.0213,0.0317,0.0440,0.0580]';
plot(x,Y,'.');
A=[ones(length(x),1) x x.^2];
beta=A\Y;
hold on
plot(x, beta(1)+beta(2)*x+beta(3)*x.^2)
leg_est=sprintf('Estimated (y=%.4f+%.4fx+%.4fx^2)',beta(1),beta(2),beta(3))
legend('Data',leg_est)
I'm trying to find the line which best fits the data. I use the code below, but now I want to place the data into an array sorted so that the points closest to the line come first. How can I do this? Also, is polyfit the correct function to use for this?
x=[1,2,2.5,4,5];
y=[1,-1,-.9,-2,1.5];
n=1;
p = polyfit(x,y,n)
f = polyval(p,x);
plot(x,y,'o',x,f,'-')
PS: I'm using Octave 4.0 which is similar to Matlab
You can first compute the error between the real value y and the predicted value f
err = abs(y-f);
Then sort the error vector
[val, idx] = sort(err);
And use the sorted indexes to have your y values sorted
y2 = y(idx);
Now y2 has the same values as y but the ones closer to the fitting value first.
Do the same for x to compute x2 so you have a correspondence between x2 and y2
x2 = x(idx);
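Putting the steps together with the data from the question gives the following sketch (the names err_sorted and sorted_points are mine). Note that err is the vertical distance to the line; for a straight line the perpendicular distance is err/sqrt(1 + p(1)^2), which is the same for every point up to that constant factor, so the ordering is identical either way.

```matlab
x = [1, 2, 2.5, 4, 5];
y = [1, -1, -0.9, -2, 1.5];

% Fit a straight line and evaluate it at the data points.
p = polyfit(x, y, 1);
f = polyval(p, x);

% Vertical distance of each point from the line, sorted ascending.
err = abs(y - f);
[err_sorted, idx] = sort(err);

% One row per point: x, y, distance; closest point first.
sorted_points = [x(idx).', y(idx).', err_sorted.'];
```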
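Sembei Norimaki did a good job of explaining your primary question, so I will look at your secondary question: is polyfit the right function?
The best-fit line, in the least-squares sense, is the line that minimizes the sum of squared errors between the modeled and the actual y values; when the line includes an intercept, the mean error comes out as zero.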
If it must be a "line" we could use polyfit, which will fit a polynomial. Of course, a "line" can be defined as first degree polynomial, but first degree polynomials have some properties that make it easy to deal with. The first order polynomial (or linear) equation you are looking for should come in this form:
y = mx + b
where y is your dependent variable and x is your independent variable. So the challenge is this: find the m and b such that the modeled y is as close to the actual y as possible. As it turns out, the squared error associated with a linear fit is convex in m and b, meaning it has a single minimum. To find this minimum, it is simplest to combine the x vector and a column of ones for the bias, as follows:
Xcombined = [x.' ones(length(x),1)];
Then use the normal equation, derived from minimizing the squared error:
beta = inv(Xcombined.'*Xcombined)*(Xcombined.')*(y.')
Great, now our line is defined as Y = Xcombined*beta, with beta = [m; b]. To draw the line, simply sample some range of x and append the column of ones for the b term:
Xplot = [[0:.1:5].' ones(length([0:.1:5].'),1)];
Yplot = Xplot*beta;
plot(Xplot(:,1), Yplot);
So why does polyfit work so poorly? Well, I can't say for sure, but my hypothesis is that you need to transpose your x and y vectors. I would guess that this would give you a much more reasonable line.
x = x.';
y = y.';
then try
p = polyfit(x,y,n)
I hope this helps. A wise man once told me (and as I learn every day), don't trust an algorithm you do not understand!
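A quick way to test that hypothesis is to compare the normal-equation solution with what polyfit returns on the same data. The sketch below uses the backslash operator instead of inv (it solves the same least-squares problem and is usually the more numerically robust choice); coef_diff and mean_residual are illustrative names. As far as I can tell polyfit solves the same problem, so the two results should agree here even with row vectors.

```matlab
x = [1, 2, 2.5, 4, 5];
y = [1, -1, -0.9, -2, 1.5];

% Design matrix: x column plus a column of ones for the intercept.
Xcombined = [x.', ones(numel(x), 1)];

% Least-squares solution, beta = [m; b].
beta = Xcombined \ y.';

% polyfit returns [m, b] for a first-degree polynomial.
p = polyfit(x, y, 1);
coef_diff = norm(beta - p(:));

% With an intercept in the model, the residuals average to zero.
residuals = y.' - Xcombined*beta;
mean_residual = mean(residuals);
```

If coef_diff is at rounding-error level, the poor-looking line comes from the data being far from linear rather than from polyfit or the vector orientation.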
Here's some test code that may help someone else dealing with linear regression and least squares
%https://youtu.be/m8FDX1nALSE matlab code
%https://youtu.be/1C3olrs1CUw good video to work out by hand if you want to test
function [a0 a1] = rtlinreg(x,y)
x=x(:);
y=y(:);
n=length(x);
a1 = (n*sum(x.*y) - sum(x)*sum(y))/(n*sum(x.^2) - (sum(x))^2); %a1 this is the slope of linear model
a0 = mean(y) - a1*mean(x); %a0 is the y-intercept
end
x=[65,65,62,67,69,65,61,67]'
y=[105,125,110,120,140,135,95,130]'
[a0 a1] = rtlinreg(x,y); %a1 is the slope of linear model, a0 is the y-intercept
x_model =min(x):.001:max(x);
y_model = a0 + a1.*x_model; %y=-186.47 +4.70x
plot(x,y,'x',x_model,y_model)
I'm trying to fit an exponential curve to data sets containing damped harmonic oscillations. The data is a bit complicated in the sense that the sinusoidal oscillations contain many frequencies as seen below:
I need to find the rate of decay in the data. The method I am using can be found here. How it works, is it takes the log of the y values above the steady state value and then uses:
lsqlin(A,y1(:),-A,-y1(:),[],[],[],[],[],optimset('algorithm','active-set','display','off'))
To fit it.
However, this results in the following data fits:
I tried using a linear regression fit which obviously didn't work because it took the average. I also tried RANSAC thinking that there is more data near the peaks. It worked a bit better than the linear regression but the method is flawed as there are times when more points exist at the wrong regions.
Does anyone know of a good method to just fit the peaks for this data?
Currently, I'm thinking of dividing the 500 data points into regions of 10 points each and finding the largest value in each region. At the end, I should have 50 points that I can fit using any of the exponential fitting methods mentioned above. What do you think of this method?
Thought I'd give everyone an update of potential solutions that may work. As mentioned earlier, the data is complicated by the varying sinusoidal frequencies, so certain methods may not work because of this. The methods listed below can be good depending on the data and the frequencies involved.
First off, I assume that the data has the form:
y = average + b*e^-(c*x)
In my case, the average is 290 so we have:
y = 290 + b*e^-(c*x)
With that being said, let's dive into the different methods that I tried:
findpeaks() Method
This is the method that Alexander Büse suggested. It's a pretty good method for most data, but for my data, since there's multiple sinusoidal frequencies, it gets the wrong peaks. The red x's show the peaks.
% Find Peaks Method
[max_num,max_ind] = findpeaks(y(ind));
plot(max_ind,max_num,'x','Color','r'); hold on;
x1 = max_ind;
y1 = log(max_num-290);
coeffs = polyfit(x1,y1,1)
b = exp(coeffs(2));
c = -coeffs(1); % slope of log(y-290) vs x is -c for y = 290 + b*e^-(c*x)
RANSAC
RANSAC is good if you have most of your data at the peaks. You see that in mine, because of the multiple frequencies, more peaks exist near the top. However, the problem with my data is that not all the data sets are like this. Hence, it occasionally worked.
% RANSAC Method
ind = (y > avg);
x1 = x(ind);
y1 = log(y(ind) - avg);
iterNum = 300;
thDist = 0.5;
thInlrRatio = .1;
[t,r] = ransac([x1;y1'],iterNum,thDist,thInlrRatio);
k1 = -tan(t);
b1 = r/cos(t);
% plot(x1,k1*x1+b1,'r'); hold on;
b = exp(b1);
c = -k1; % k1 is the slope of log(y-avg) vs x, which is -c for y = avg + b*e^-(c*x)
Lsqlin Method
This method is the one used here. It uses Lsqlin to constrain the system. However, it seems to ignore the data in the middle. Depending on your data set, this could work really well as it did for the person in the original post.
% Lsqlin Method
avg = 290;
ind = (y > avg);
x1 = x(ind);
y1 = log(y(ind) - avg);
A = [ones(numel(x1),1),x1(:)]*1.00;
coeffs = lsqlin(A,y1(:),-A,-y1(:),[],[],[],[],[],optimset('algorithm','active-set','display','off'));
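b = exp(coeffs(1)); % coeffs(1) multiplies the column of ones, so it is the intercept log(b)
c = -coeffs(2); % coeffs(2) is the slope of log(y-avg) vs x, which is -c for y = avg + b*e^-(c*x)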
Find Peaks in Period
This is the method I mentioned in my post, where I take the peak in each region. This method works pretty well, and from it I realized that my data may not actually have a perfect exponential fit. We see that it is unable to fit the large peaks at the beginning. I was able to make this a bit better by only using the first 150 data points and ignoring the steady-state data points. Here I found the peak every 25 data points.
% Incremental Method 2 Unknowns
x1 = [];
y1 = [];
max_num=[];
max_ind=[];
incr = 25;
for i=1:floor(size(y,1)/incr)
[max_num(end+1),max_ind(end+1)] = max(y(1+incr*(i-1):incr*i));
max_ind(end) = max_ind(end) + incr*(i-1);
if max_num(end) > avg
x1(end+1) = max_ind(end);
y1(end+1) = log(max_num(end)-290);
end
end
plot(max_ind,max_num,'x','Color','r'); hold on;
coeffs = polyfit(x1,y1,1)
b = exp(coeffs(2));
c = -coeffs(1); % slope of log(y-290) vs x is -c for y = 290 + b*e^-(c*x)
Using all 500 data points:
Using the first 150 data points:
Find Peaks in Period With b Constrained
Since I want it to start at the first peak, I constrained the b value. I know the system is y = 290 + b*e^-(c*x) and I constrain it such that b = y(1) - 290. By doing so, I just need to solve for c, where c = (log(b) - log(y-290))/x. I can then take the mean or median of c. This method is quite good as well; it doesn't fit the values near the end as well, but that isn't as big of a deal since the change there is minimal.
% Incremental Method 1 Unknown (b is constrained y(1)-290 = b)
b = y(1) - 290;
c = [];
max_num=[];
max_ind=[];
incr = 25;
for i=1:floor(size(y,1)/incr)
[max_num(end+1),max_ind(end+1)] = max(y(1+incr*(i-1):incr*i));
max_ind(end) = max_ind(end) + incr*(i-1);
if max_num(end) > avg
c(end+1) = (log(b)-log(max_num(end)-290))/max_ind(end); % from y-290 = b*e^-(c*x)
end
end
c = mean(c); % median(c) works just as well
Here I take the peak for every 25 data points and then take the mean of c
Here I take the peak for every 25 data points and then take the median of c
Here I take the peak for every 10 data points and then take the mean of c
If the main goal is to extract the damping parameter from the fit, maybe you want to consider fitting directly a damped sine curve to your data. Something like this (created with the curve fitting tool):
[xData, yData] = prepareCurveData( x, y );
ft = fittype( 'a + sin(b*x - c).*exp(d*x)', 'independent', 'x', 'dependent', 'y' );
opts = fitoptions( 'Method', 'NonlinearLeastSquares' );
opts.Display = 'Off';
opts.StartPoint = [1 0.285116122712545 0.805911873245316 0.63235924622541];
[fitresult, gof] = fit( xData, yData, ft, opts );
plot( fitresult, xData, yData );
Especially since some of your example data really don't have many data points in the interesting region (above the noise).
If however, you really need to fit directly to maxima of the experimental data, you could use the findpeaks function to select only the maxima and then fit to them. You may want to play a bit with the MinPeakProminence parameter to adjust it to your needs.
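As a rough sketch of that second route, assuming the Signal Processing Toolbox (for findpeaks) and a known baseline: the synthetic signal, the baseline of 290, and the prominence threshold of 5 are all made-up values that would need tuning on real data.

```matlab
% Synthetic damped oscillation around a known baseline (assumed values).
x = (1:500).';
baseline = 290;
y = baseline + 80*exp(-0.01*x).*cos(0.3*x);

% Keep only peaks that stand out by at least 5 units from their surroundings.
[pks, locs] = findpeaks(y, 'MinPeakProminence', 5);

% Straight-line fit to log(peak height above baseline) versus position.
coeffs = polyfit(x(locs), log(pks - baseline), 1);
decay_rate = -coeffs(1);        % envelope behaves like exp(-decay_rate*x)
amplitude  = exp(coeffs(2));
```

Raising MinPeakProminence discards the small late peaks that sit close to the noise floor, which are also the ones that distort a log-domain fit the most.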
I use the following example, taken from the site that describes fit. It returns an object. Is it possible to extract the data that describe the fitted surface?
load franke
sf = fit([x, y],z,'poly23')
plot(sf,[x,y],z)
Thanks!
Here is a way to do it; but there is probably a cleaner one:
After getting your sf object, you can access its methods like so:
MethodName(sf)
See here for the list of available methods
So let's say you wish to plot the surface using a handle for the plot:
hPlot = plot(sf)
Then you can fetch the XData, YData and ZData using the handles like so:
X = get(hPlot,'XData')
Y = get(hPlot,'YData')
Z = get(hPlot,'ZData')
It might be a bit cumbersome, but it works. Note that you can also fetch the coefficients of the fitted surface like so:
coeffvalues(sf)
and the formula used to generate it:
formula(sf)
Therefore you could generate X, Y data and create Z values using meshgrid for example and you could then modify the surface as you wish.
EDIT Here is how you could create your own surface using the coefficients and the formula. Here I create an anonymous function with two input arguments (x and y) and use it to generate z values to plot. Based on the data obtained using plot(sf), I used x = 500:100:3000 and y = 0.01:0.01:1, but you could obviously change them.
I entered the formula manually in the function handle, but there has to be a better way; I'm in a bit of a hurry so I did not look further into that, but you could extract every term of the formula and multiply it by the right coefficient to generate the function automatically.
Here is the whole code:
clear
clc
close all
load franke
sf = fit([x, y],z,'poly23')
c = coeffvalues(sf)
F = formula(sf)
%// Generate x and y values.
[x,y] = meshgrid(500:100:3000,0.01:.01:1);
%// There should be a better approach than manually entering the data haha.
%// Maybe use eval or feval.
MyFun = @(x,y) (c(1) + c(2)*x + c(3)*y +c(4)*x.^2 + c(5)*x.*y + c(6)*y.^2 + c(7)*(x.^2).*y + c(8)*x.*y.^2 + c(9)*y.^3);
%// Generate z data to create a surface
z = (MyFun(x,y));
figure
subplot(1,2,1)
plot(sf)
title('Plot using sf','FontSize',18)
subplot(1,2,2)
surf(x,y,z)
title('Plot using MyFun','FontSize',18)
Output:
[I know this is many years later, but I thought I'd add this for future visitors looking for answers]
There is a much simpler way to do this: use feval, or the implicit shortcut to it, which is calling the fittype object itself. It took me a little digging to find this when I needed it, as it isn't particularly obvious.
From the feval MATLAB documentation:
Standard feval syntax:
y = feval(cfun,x)
z = feval(sfun,[x,y])
z = feval(sfun,x,y)
y = feval(ffun,coeff1,coeff2,...,x)
z = feval(ffun,coeff1,coeff2,...,x,y)
You can use feval to evaluate fits, but the following simpler syntax is recommended to evaluate these objects, instead of calling feval directly. You can treat fit objects as functions and call feval indirectly using the following syntax:
y = cfun(x) % cfit objects;
z = sfun(x,y) % sfit objects
z = sfun([x, y]) % sfit objects
y = ffun(coef1,coef2,...,x) % curve fittype objects;
z = ffun(coef1,coef2,...,x,y) % surface fittype objects;
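Applied to the franke example from the question, this could look like the sketch below. It assumes the Curve Fitting Toolbox; the 40-by-40 grid and the names xg, yg and zg are arbitrary choices of mine. The grid is passed as two columns, which matches the documented feval(sfun,[x,y]) form, and the result is reshaped back to the grid.

```matlab
load franke                         % example data shipped with the toolbox
sf = fit([x, y], z, 'poly23');

% Regular grid spanning the data (40 points per axis is arbitrary).
[xg, yg] = meshgrid(linspace(min(x), max(x), 40), ...
                    linspace(min(y), max(y), 40));

% Evaluate the fit at every grid point, then restore the grid shape.
zg = reshape(feval(sf, [xg(:), yg(:)]), size(xg));

% xg, yg, zg are ordinary matrices that can be plotted or exported.
surf(xg, yg, zg)
```

This avoids typing the formula by hand, so it keeps working if the model type passed to fit changes.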
I have a data-set which is loaded into matlab. I need to do exponential fitting for the plotted curve without using the curve fitting tool cftool.
I want to do this manually through executing a code/function that will output the values of a and b corresponding to the equation:
y = a*exp(b*x)
Then, using those values, I will do error optimization and create the best fit for the data I have.
Any help please?
Thanks in advance.
Try this...
f = fit(x,y,'exp1');
I think the typical objective in this type of assignment is to recognize that, by taking the log of both sides, ordinary polynomial (here linear) fitting methods can be used:
ln(y) = ln(a) + ln( exp(b*x) )
ln(y) = ln(a) + b*x
There can be difficulties with this approach when errors such as noise are involved due to the behavior of ln as it approaches zero.
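A minimal sketch of the log-transform approach using only polyfit, on noise-free synthetic data (the true values 2.5 and -0.8 are made up for the example). It assumes every y is strictly positive; otherwise log(y) is not real.

```matlab
% Synthetic data following y = a*exp(b*x) with a = 2.5, b = -0.8.
x = (0:0.5:5).';
y = 2.5*exp(-0.8*x);

% ln(y) = ln(a) + b*x is a straight line in x.
p = polyfit(x, log(y), 1);

b = p(1);          % slope
a = exp(p(2));     % intercept is ln(a)
```

With noisy data, the log transform weights small y values more heavily, so the resulting a and b are often best treated as a starting guess for a nonlinear fit rather than as the final answer.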
In this exercise I have a set of data that present an exponential curve and I want to fit them exponentially and get the values of a and b. I used the following code and it worked with the data I have.
"trail.m" file:
%defining the data used
load trialforfitting.txt;
xdata= trialforfitting(:,1);
ydata= trialforfitting(:,2);
%calling the "fitcurvedemo" function
[estimates, model] = fitcurvedemo(xdata,ydata)
plot(xdata, ydata, 'o'); %Data curve
hold on
[sse, FittedCurve] = model(estimates);
disp(sse); %sse only exists after model(estimates) has been called
plot(xdata, FittedCurve, 'r')%Fitted curve
xlabel('Voltage (V)')
ylabel('Current (A)')
title('Exponential Fitting to IV curves');
legend('data', ['Fitting'])
hold off
"fitcurvedemo.m" file:
function [estimates, model] = fitcurvedemo(xdata, ydata)
%Call fminsearch with a random starting point.
start_point = rand(1, 2);
model = @expfun;
estimates = fminsearch(model, start_point);
%"expfun" accepts curve parameters as inputs, and outputs
%the sum of squares error [sse] expfun is a function handle;
%a value that contains a matlab object methods and the constructor
%"FMINSEARCH" only needs sse
%estimate returns the value of A and lambda
%model computes the exponential function
function [sse, FittedCurve] = expfun(params)
A = params(1);
lambda = params(2);
%exponential function model to fit
FittedCurve = A .* exp(lambda * xdata);
ErrorVector = FittedCurve - ydata;
%output of the expfun function [sum of squares of error]
sse = sum(ErrorVector .^ 2);
end
end
I have a new set of data that doesn't work with this code: it does not give an appropriate exponential fit for the plotted data curve.
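One possible reason (I cannot tell without seeing the data) is the random starting point: fminsearch is a local search, so rand(1, 2) can start far from sensible values of A and lambda when the data are on a very different scale. A sketch of one alternative is to seed the search from a straight-line fit to log(ydata). The data below are hypothetical, and sse_fun is simply the same objective as expfun written as an anonymous function.

```matlab
% Hypothetical exponential data standing in for the new data set.
xdata = linspace(0, 2, 50).';
ydata = 2*exp(1.5*xdata);

% Rough estimate from a straight-line fit to log(ydata); needs ydata > 0.
ok = ydata > 0;
p = polyfit(xdata(ok), log(ydata(ok)), 1);
start_point = [exp(p(2)), p(1)];          % [A, lambda]

% Same sum-of-squares objective as expfun, as an anonymous function.
sse_fun = @(params) sum((params(1).*exp(params(2).*xdata) - ydata).^2);

% Refine the estimate in the original (non-log) space.
estimates = fminsearch(sse_fun, start_point);
```

If the new data contain zero or negative values, or include an offset, a pure A*exp(lambda*x) model may simply not describe them, and no starting point will fix that.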