The documentation I am reading for curve fitting doesnt seem to be that complicated, but clearly I am doing something wrong.
Given x,y data, trying to fit a polynomial function of degree 3, the data points are seemingly ignored. Can someone explain both how to fix this, and what this curve is actually calculating, as opposed to what I think it should be calculating?
Data: http://pastebin.com/4EXu0FSv
You most likely are using the wrong regression model or interval (or points). Curve fitting is very very complex topic and can not be simply solved. Have a read of the Mathworks site about the curve fitting toolbox here .
However I would not be fitting a 3 order polynomial to this data. I would be more inclinded to fit positive reciprocal function - see if that gives you a better fit.
Related
I want to produce a figure like the following one (found in a paper)
I think it is done using histfit
However, histfit doesen't really work with my data. The bars exceed the curve. My data is not really normally distributed but I want all the bins to be inside the curve except some outliers. Is there any way to fit a gaussian and plot it like in the above figure?
Edit
This is what histfit(data)has given
I want to fit a gaussian to it and keep some values as ouliers. I need to only use a normal distribution as it is going to be used in a Kalman filter based on the assumption that the data is normally distributed. The fact that is not really normally distributed will certainly affect the performance of the filter but I have to feed it first with the parameters of a normal distribution , i.e mean and std.
I'm not sure you understand how a fit works, if your data is kinda gaussian the function will plot the fitted curve based on the values, some bars will be above some below, it all depends on how the least squares are minimized over the entire curve. you can't force the fit to look different, this is the result of the fitting process. If your data is not normally distributed then the goodness of the fit is poor. without having more info or data, this is the best I can answer :)
I wanted to extrapolate some of the data I had, as shown in the plot below. The blue line is the original data and the red line is the extrapolation that I wanted.
To use regression analysis, I used the function polyfit:
sizespecial = size(i_C);
endgoal = sizespecial(2);
plothelp = 1:endgoal;
reg1 = polyfit(plothelp,i_C,2);
reg2 = polyfit(plothelp,i_D,2);
Where i_C and i_D are the vectors that represent the original data. I extended the data by using this code:
plothelp=1:endgoal+11;
for in = endgoal+1:endgoal+11
i_C(in) = (reg1(1)*(in^2))+(reg1(2)*in)+reg1(3);
i_D(in) = (reg2(1)*(in^2))+(reg2(2)*in)+reg2(3);
end
However, the graph I output now is:
I do not understand why the extra notch is introduced (circled in red). Do not hesitate to ask me to clarify any of the details on this questions and thank you for all your answers.
What I imagine is happening is that you are trying fit a second order polynomial over all your data. My guess is that this polynomial will look a lot like the curve I have drawn in in orange. If you follow Matt's advise from his comment and plot your regressed polynomial over the your original data as well (not just the extrapolated part) you should confirm this.
You might get better results by fitting a higher order polynomial. Your data have two points of inflection so a 3rd order polynomial will probably work quite well. One danger of extrapolating on higher order polynomial however is that they could have fairly dramatic inflections outside of the domain of your data and produce unexpected and wild results.
One way to mitigate against this is by rather performing a linear regression over the final x data points of your series. These are the points highlighted in yellow in the figure. You can tune x as a parameter such that it covers as much of the approximately linear final portion of your curve as makes sense. The red line I have drawn in will be the result of a linear regression performed on only those data (as opposed to the entire data set)
Another option might be to rather fit a spline curve and extrapolate on that. You can use the interp1 function specifying 'spline' or 'pchip' for that.
However which is the best choice will depend largely on the nature of the problem you are trying to solve.
I have some experimental data from an oscillating system (time domaine) and I would like to get an approximation of the damping ratio (zeta). I have already try to use the half-power band width method with the vibrationdata Matlab package. But I would like to compare the result with another method.
I found several methods in this paper : www.vce.at/sites/default/files/uploads/downloads/2007_damping_estimation.pdf and I would like to try to do the curve fitting method with the Curve Fitting Toolbox of Matlab (R2014b) (chapter 2.2.2 Curve fitting in the paper).
It's the first time I use this Toolbox with a custom equation, so, I do not really know how to do it with my sample data.
I do not know how to write in the Curve Fitting Toolbow, the equation of the paper as it involves differentiate equation...
Can anyone help me regarding this.
I have about 100 data points which mostly satisfying a certain function (but some points are off). I would like to plot all those points in a smooth curve but the problem is the points are not uniformly distributed. So is that anyway to get the smooth curve? I am thinking to interpolate some points in between, but the only way that comes up to my mind is to linearly insert some artificial points between two data points. But that will show a pretty weird shape (like some sharp corner). So any better idea? Thanks.
If you know more or less what the actual curve should be, you can try to fit that curve to your points (e.g. using polyfit). Depending on how many points are off and how far, you can get by with least squares regression (which is fairly easy to get working). If you have too many outliers (or they are much too large/small), you can also try robust regression (e.g. least absolute deviation fitting) using the robustfit function.
If you can manually determine the outliers, you can also fit a curve through the other points to get better results or even use interpolation methods (e.g. interp1 in MATLAB) on those points to get a smoother curve.
If you know which function describes your data, robust fitting (using, e.g. ROBUSTFIT, or the new convenient functions LINEARMODEL and NONLINEARMODEL with the robust option) is a good way to go if there are outliers in your data.
If you don't know the function that describes your data, but want a smooth trendline that is little affected by outliers, SMOOTHN from the File Exchange does an excellent job in my experience.
Have you looked at the use of smoothing splines? Like interpolating splines, but with the knot points and coefficients chosen to minimise a least-squares error function. There is an excellent implementation available from Matlab central which I have used successfully.
I have a problem where I am fitting a high-order polynomial to (not very) noisy data using linear least squares. Currently I'm using polynomial orders around 15 - 25, which work surprisingly well: The dependence is very nearly linear, but the accuracy of modelling the 'very nearly' is critical. I'm using Matlab's polyfit() function, and (obviously) normalising the x-data. This generally works fine, but I have come across an issue with some recent datasets. The fitted polynomial has extrema within the x-data interval. For the application I'm working on this is a non-no. The polynomial model must have no stationary points over the x-interval.
So I need to add a constraint to the least-squares problem: the derivative of the fitted polynomial must be strictly positive over a known x-range (or strictly negative - this depends on the data but a simple linear fit will quickly tell me which it is.) I have had a quick look at the available optimisation toolbox functions, but I admit I'm at a loss to know how to go about this. Does anyone have any suggestions?
[I appreciate there are probably better models than polynomials for this data, but in the short term it isn't feasible to change the form of the model]
[A closing note: I have finally got the go-ahead to replace this awful polynomial model! I am going to adopt a nonparametric approach, spline smoothing, using the excellent SPLINEFIT code by Jonas Lundgren. This has the advantage that I'm already using a spline model in the end-user application, so I already have C# code available to evaluate a spline model]
You could use cftool and use the exclude data points option.