how to make a smooth plot in matlab - matlab

I have about 100 data points which mostly satisfying a certain function (but some points are off). I would like to plot all those points in a smooth curve but the problem is the points are not uniformly distributed. So is that anyway to get the smooth curve? I am thinking to interpolate some points in between, but the only way that comes up to my mind is to linearly insert some artificial points between two data points. But that will show a pretty weird shape (like some sharp corner). So any better idea? Thanks.

If you know more or less what the actual curve should be, you can try to fit that curve to your points (e.g. using polyfit). Depending on how many points are off and how far, you can get by with least squares regression (which is fairly easy to get working). If you have too many outliers (or they are much too large/small), you can also try robust regression (e.g. least absolute deviation fitting) using the robustfit function.
If you can manually determine the outliers, you can also fit a curve through the other points to get better results or even use interpolation methods (e.g. interp1 in MATLAB) on those points to get a smoother curve.

If you know which function describes your data, robust fitting (using, e.g. ROBUSTFIT, or the new convenient functions LINEARMODEL and NONLINEARMODEL with the robust option) is a good way to go if there are outliers in your data.
If you don't know the function that describes your data, but want a smooth trendline that is little affected by outliers, SMOOTHN from the File Exchange does an excellent job in my experience.

Have you looked at the use of smoothing splines? Like interpolating splines, but with the knot points and coefficients chosen to minimise a least-squares error function. There is an excellent implementation available from Matlab central which I have used successfully.

Related

Fitting a gaussian to data with Matlab

I want to produce a figure like the following one (found in a paper)
I think it is done using histfit
However, histfit doesen't really work with my data. The bars exceed the curve. My data is not really normally distributed but I want all the bins to be inside the curve except some outliers. Is there any way to fit a gaussian and plot it like in the above figure?
Edit
This is what histfit(data)has given
I want to fit a gaussian to it and keep some values as ouliers. I need to only use a normal distribution as it is going to be used in a Kalman filter based on the assumption that the data is normally distributed. The fact that is not really normally distributed will certainly affect the performance of the filter but I have to feed it first with the parameters of a normal distribution , i.e mean and std.
I'm not sure you understand how a fit works, if your data is kinda gaussian the function will plot the fitted curve based on the values, some bars will be above some below, it all depends on how the least squares are minimized over the entire curve. you can't force the fit to look different, this is the result of the fitting process. If your data is not normally distributed then the goodness of the fit is poor. without having more info or data, this is the best I can answer :)

Curve fitting, but I want to guarantee only one inflection point

I often find myself fitting a scatter plot, and knowing that the 'true fit' should have only one inflection point. Any ideas for forcing a fit that will obey this?
I am using Matlab and Microsoft Excel
Many thanks
Option 1:
I like to use spline smoothing with Akaike information criteria, and while it is a hyper-parametric fit and has a large number of analytic candidate inflection points, the smoothed data at the sample points tends to reveal only what is within the data.
If your data doesn't actually have an inflection point, this is indicated. If it does, it is also usually captured. Statistical jargon for an important cousin to this is called a "non-informative prior".
Try slides 30-31 here: link.
Option 2:
If you have an older version of MatLab then you can specify the exact model easily in the "cftool" (not the same as sftool) then get m-file that gives how you put it into your own script. Pick a model appropriate to your data.

Resampling data with minimal loss of information in time-domain

I am trying to resample/recreate already recorded data for plotting purposes. I thought this is best place to ask the question (besides dsp.se).
The data is sampled at high frequency, contains to much data points and not suitable for plotting in time domain (not enough memory). i want to sample it with minimal loss. The sampling interval of the resulting data doesn't need to be same (well it is again for plotting purposes, not analysis) although input data in equally sampled.
When we use the regular resample command from matlab/octave, it can distort stiff pieces of the curve.
What is the best approach here?
For reference I put two pictures found in tex.se)
First image is regular resample
Second image is a better resampled data that can well behave around peaks.
You should try this set of files from the File Exchange. It computes optimal lookup table based on either the maximum set of points or a given error. You can choose from natural, linear, or spline for the interpolation methods. Spline will have the smallest table size but is slower than linear. I don't use natural unless I have a really good reason.
Sincerely,
Jason

Data interpolation over a non-homogeneous surface

In my project i deal with big data surfaces.
In a certain point, i have a line across the data, and I need the values of the points of the line.
The grid is non,homogeneous, it doesnt go from n:m with fixed steps nor nothing.
Lets ilustrate!
In the figure the 2D proyection of my data can be seen. Each of the points has also other 3 data information. I defined a arbitrary red line with the form y=ax+b. a and b are known.
How can I define i.e. 50 points in the line that has not only the x and y coords (wich is straigforward) but also the interpolation of the 3 data information of each of the points around it.
I know is not an easy question but I can't seem to step forward even a bit.
PD: realize I DONT want code written for me, but the idea of how to achieve my objective.
You could use a tool like triScatteredInterp, which will triangulate the 2-d domain, then interpolate a list of points along your line. Griddata is also an option.
I have a toolbox for problems like this (of course.) It allows me to build a triangulation of the non-convex domain in the (x,y) plane. Then it can form a completely general slice through that surface, interpolating in z also as it does so. The result will be a 1-manifold, in this case a piecewise linear function along that path in (x,y,z). While those tools are not posted on the file exchange, they are available for the person willing to invest the time to learn to use them.
If the surface you describe is a completely general one in 3-d, that might be fairly complex, then you might need a CRUST based tool to define that surface triangulation. These can be found online too. Once a triangulation is available, my tools can then be used to slice them. (Sorry, I never did finish that piece.)
What I did was to define several points in the crack line and then cheack for each one of them in wich quadrilateral it is with inpoligon matlab function (no tthe fastest way but less than 2 secs).
Then I created a triangular plane in the used quadrilaterals using x,y and Z or the othre data , achieving a linear interpolation between the data.
finally i take out all the points that are 0 o Nan.

How to fix ROC curve with points below diagonal?

I am building receiver operating characteristic (ROC) curves to evaluate classifiers using the area under the curve (AUC) (more details on that at end of post). Unfortunately, points on the curve often go below the diagonal. For example, I end up with graphs that look like the one here (ROC curve in blue, identity line in grey) :
The the third point (0.3, 0.2) goes below the diagonal. To calculate AUC I want to fix such recalcitrant points.
The standard way to do this, for point (fp, tp) on the curve, is to replace it with a point (1-fp, 1-tp), which is equivalent to swapping the predictions of the classifier. For instance, in our example, our troublesome point A (0.3, 0.2) becomes point B (0.7, 0.8), which I have indicated in red in the image linked to above.
This is about as far as my references go in treating this issue. The problem is that if you add the new point into a new ROC (and remove the bad point), you end up with a nonmonotonic ROC curve as shown (red is the new ROC curve, and dotted blue line is the old one):
And here I am stuck. How can I fix this ROC curve?
Do I need to re-run my classifier with the data or classes somehow transformed to take into account this weird behavior? I have looked over a relevant paper, but if I am not mistaken, it seems to be addressing a slightly different problem than this.
In terms of some details: I still have all the original threshold values, fp values, and tp values (and the output of the original classifier for each data point, an output which is just a scalar from 0 to 1 that is a probability estimate of class membership). I am doing this in Matlab starting with the perfcurve function.
Note based on some very helpful emails about this from the people that wrote the articles cited above, and the discussion above, the right answer seems to be: do not try to "fix" individual points in an ROC curve unless you build an entirely new classifier, and then be sure to leave out some test data to see if that was a reasonable thing to do.
Getting points below the identity line is something that simply happens. It's like getting an individual classifier that scores 45% correct even though the optimal theoretical minimum is 50%. That's just part of the variability with real data sets, and unless it is significantly less than expected based on chance, it isn't something you should worry too much about. E.g., if your classifier gets 20% correct, then clearly something is amiss and you might look into the specific reasons and fix your classifier.
Yes, swapping a point for (1-fp, 1-tp) is theoretically effective, but increasing sample size is a safe bet too.
It does seem that your system has a non-monotonic response characteristic so be careful not to bend the rules of the ROC too much or you will impact the robustness of the AUC.
That said, you could try to use a Pareto Frontier Curve (Pareto Front). If that fits the requirements of "Repairing Concavities" then you'll basically sort the points so that the ROC curve becomes monotonic.