Curve fitting, but I want to guarantee only one inflection point - matlab

I often find myself fitting a scatter plot, and knowing that the 'true fit' should have only one inflection point. Any ideas for forcing a fit that will obey this?
I am using Matlab and Microsoft Excel
Many thanks

Option 1:
I like to use spline smoothing with Akaike information criteria, and while it is a hyper-parametric fit and has a large number of analytic candidate inflection points, the smoothed data at the sample points tends to reveal only what is within the data.
If your data doesn't actually have an inflection point, this is indicated. If it does, it is also usually captured. Statistical jargon for an important cousin to this is called a "non-informative prior".
Try slides 30-31 here: link.
Option 2:
If you have an older version of MatLab then you can specify the exact model easily in the "cftool" (not the same as sftool) then get m-file that gives how you put it into your own script. Pick a model appropriate to your data.

Related

What is the official Matlab way to plot the values of histcounts into a histogram with any normalization option?

Assume that I have an array of counts (ideally returned by histcounts). Is there an official Matlab way to plot such a histogram with all the standard normalization options available?
It seems that the best suggestion I have is to get the counts from histcounts and then plot them with bar. Something like:
edges = linspace(0,bound,nbins);
hist_c = histcounts(X,nbins);
bar(edges(1:nbins-1),hist_c);
unfortunately as far as I know it seems that using bar is really not recommended according to this link. Probably because as its obvious from the code, it seems that it moves a lot of implementation details into user code (like produces edges array manually when only needing nbins or having to know if to use 1:nbins-1 vs 2:nbins).
Furthermore, which I believe is the worst, is that it leave the user to have to implement the normalization options on its own. One may point out that histcounts can do the normalization options for you, however, it can only do them given the data matrix X. If one had an extremely large matrix X, then one would be in trouble because producing the histogram counts of X could be done on the fly (as done in this question) but the other normalization options could not be easily be done on the fly. One practice the user could try to implement each normalization option as described by the equations in the documentation but it seems extremely inefficient to have users implement this by hand. Is there a way to get access to the code that actually performs this normalization?
In reality what my question is going for is, is there an official matlab way to produce histogram having only the histogram counts? In particular hiding all the implementation details of producing the counts, normalization, binning, edges, etc?
The ideal code in my mind should look like this to the user:
histogram_counts = get_hist_count(X)
plot_histogram(histogram_counts,'Normalization',normalization)
and produces the desired histogram plot.
Related question:
https://www.mathworks.com/matlabcentral/answers/332178-how-does-one-plot-a-histogram-from-the-histogram-counts
https://www.mathworks.com/matlabcentral/answers/275278-what-is-the-recommended-practice-for-plotting-the-outputs-of-histcounts
https://www.mathworks.com/matlabcentral/answers/91944-how-can-i-combine-the-options-histc-and-stack-in-a-bar-plot-in-matlab-7-4-r2007a#answer_101295

Automatically truncating a curve to discard outliers in matlab

I am generation some data whose plots are as shown below
In all the plots i get some outliers at the beginning and at the end. Currently i am truncating the first and the last 10 values. Is there a better way to handle this?
I am basically trying to automatically identify the two points shown below.
This is a fairly general problem with lots of approaches, usually you will use some a priori knowledge of the underlying system to make it tractable.
So for instance if you expect to see the pattern above - a fast drop, a linear section (up or down) and a fast rise - you could try taking the derivative of the curve and looking for large values and/or sign reversals. Perhaps it would help to bin the data first.
If your pattern is not so easy to define but you are expecting a linear trend you might fit the data to an appropriate class of curve using fit and then detect outliers as those whose error from the fit exceeds a given threshold.
In either case you still have to choose thresholds - mean, variance and higher order moments can help here but you would probably have to analyse existing data (your training set) to determine the values empirically.
And perhaps, after all that, as Shai points out, you may find that lopping off the first and last ten points gives the best results for the time you spent (cf. Pareto principle).

Resampling data with minimal loss of information in time-domain

I am trying to resample/recreate already recorded data for plotting purposes. I thought this is best place to ask the question (besides dsp.se).
The data is sampled at high frequency, contains to much data points and not suitable for plotting in time domain (not enough memory). i want to sample it with minimal loss. The sampling interval of the resulting data doesn't need to be same (well it is again for plotting purposes, not analysis) although input data in equally sampled.
When we use the regular resample command from matlab/octave, it can distort stiff pieces of the curve.
What is the best approach here?
For reference I put two pictures found in tex.se)
First image is regular resample
Second image is a better resampled data that can well behave around peaks.
You should try this set of files from the File Exchange. It computes optimal lookup table based on either the maximum set of points or a given error. You can choose from natural, linear, or spline for the interpolation methods. Spline will have the smallest table size but is slower than linear. I don't use natural unless I have a really good reason.
Sincerely,
Jason

Multi-parametric regression in MATLAB?

I have a curve which looks roughly / qualitative like the curves displayed in those 3 images.
The only thing I know is that the first part of the curve is hardware-specific supposed to be a linear curve and the second part is some sort of logarithmic part (might be a combination of two logarithmic curves), i.e. linlog camera. But I couldn't tell the mathematic structure of the equation, e.g. wether it looks like a*log(b)+c or a*(log(c+b))^2 etc. Is there a way to best fit/find out a good regression for this type of curve and is there a certain way to do this specifically in MATLAB? :-) I've got the student version, i.e. all toolboxes etc.
fminsearch is a very general way to find best-fit parameters once you have decided on a parametric equation. And the optimization toolbox has a range of more-sophisticated ways.
Comparing the merits of one parametric equation against another, however, is a deep topic. The main thing to be aware of is that you can always tweak the equation, adding another term or parameter or whatever, and get a better fit in terms of lower sum-squared-error or whatever other goodness-of-fit metric you decide is appropriate. That doesn't mean it's a good thing to keep adding parameters: your solution might be becoming overly complex. In the end the most reliable way to compare how well two different parametric models are doing is to cross-validate: optimize the parameters on a subset of the data, and evaluate only on data that the optimization procedure has not yet seen.
You can try the "function finder" on my curve fitting web site zunzun.com and see what it comes up with - it is free. If you have any trouble please email me directly and I'll do my best to help.
James Phillips
zunzun#zunzun.com

how to make a smooth plot in matlab

I have about 100 data points which mostly satisfying a certain function (but some points are off). I would like to plot all those points in a smooth curve but the problem is the points are not uniformly distributed. So is that anyway to get the smooth curve? I am thinking to interpolate some points in between, but the only way that comes up to my mind is to linearly insert some artificial points between two data points. But that will show a pretty weird shape (like some sharp corner). So any better idea? Thanks.
If you know more or less what the actual curve should be, you can try to fit that curve to your points (e.g. using polyfit). Depending on how many points are off and how far, you can get by with least squares regression (which is fairly easy to get working). If you have too many outliers (or they are much too large/small), you can also try robust regression (e.g. least absolute deviation fitting) using the robustfit function.
If you can manually determine the outliers, you can also fit a curve through the other points to get better results or even use interpolation methods (e.g. interp1 in MATLAB) on those points to get a smoother curve.
If you know which function describes your data, robust fitting (using, e.g. ROBUSTFIT, or the new convenient functions LINEARMODEL and NONLINEARMODEL with the robust option) is a good way to go if there are outliers in your data.
If you don't know the function that describes your data, but want a smooth trendline that is little affected by outliers, SMOOTHN from the File Exchange does an excellent job in my experience.
Have you looked at the use of smoothing splines? Like interpolating splines, but with the knot points and coefficients chosen to minimise a least-squares error function. There is an excellent implementation available from Matlab central which I have used successfully.