Fitting a Gaussian to data with MATLAB

I want to produce a figure like the following one (found in a paper).
I think it is done using histfit.
However, histfit doesn't really work with my data: the bars exceed the curve. My data is not really normally distributed, but I want all the bins to be inside the curve except for some outliers. Is there any way to fit a Gaussian and plot it like in the above figure?
Edit
This is what histfit(data) has given.
I want to fit a Gaussian to it and treat some values as outliers. I need to use only a normal distribution, as it is going to be used in a Kalman filter that assumes the data is normally distributed. The fact that the data is not really normally distributed will certainly affect the performance of the filter, but I first have to feed it the parameters of a normal distribution, i.e. the mean and the standard deviation.

I'm not sure you understand how a fit works. If your data is roughly Gaussian, histfit estimates the mean and standard deviation from all of the values and plots the corresponding curve, scaled to the histogram; some bars will end up above it and some below. You can't force the fit to look different: this is the result of the fitting process. If your data is not normally distributed, the goodness of fit will be poor. Without more info or data, this is the best I can answer :)
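If the goal is only a mean and standard deviation that describe the bulk of the data and are less influenced by a few outliers, one option is a robust estimate. This is a minimal sketch with synthetic data, not what histfit does: it swaps in the median and the scaled median absolute deviation (MAD), and the 3-sigma outlier cut-off and all variable names are assumptions for illustration. It uses only base MATLAB functions.

```matlab
% Synthetic stand-in for the asker's data: a Gaussian core plus some outliers.
rng(1);
data = [5 + 2*randn(1, 2000), 25 + 3*randn(1, 60)];

% Plain estimates, pulled towards the outliers.
muAll    = mean(data);
sigmaAll = std(data);

% Robust estimates: median and scaled median absolute deviation.
% The factor 1.4826 makes the MAD consistent with sigma for normal data.
muRob    = median(data);
sigmaRob = 1.4826 * median(abs(data - muRob));

% Flag points far from the robust centre (the 3-sigma cut-off is an assumption).
isFar = abs(data - muRob) > 3 * sigmaRob;

% Overlay the robust Gaussian on a density-normalised histogram.
x      = linspace(min(data), max(data), 400);
pdfRob = exp(-(x - muRob).^2 / (2*sigmaRob^2)) / (sigmaRob * sqrt(2*pi));
histogram(data, 'Normalization', 'pdf'); hold on
plot(x, pdfRob, 'r', 'LineWidth', 1.5); hold off
```

The curve then follows the core of the data, but individual bars can still rise above it. Whether muRob and sigmaRob are appropriate inputs for the Kalman filter depends on how the outliers should be treated there.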

Related

Matlab Image Histogram Analysis: how do I test for an underlying bimodal distribution?

I am working with image processing in MATLAB. I have two different images whose histogram plots are as shown below.
Image 1:
and
Image 2:
I have multiple images like those, and the only distinguishing (separating) feature is that some have a single peak and others have two peaks.
In other words some can be thresholded (to generate good results) while others cannot. Is there any way I can separate the two images? Are there any functions that do so in MATLAB or any reference code that will help?
The function used is imhist()
If you mean "distinguish" by "separate", then yes: the property you describe is called bimodality, i.e. you have two peaks that can be separated by one threshold. So your question is actually "how do I test for an underlying bimodal distribution?"
One option to do this programmatically is binning. This is not the most robust method, but it is the easiest. It might work, it might not.
Kernel smoothing is probably the more robust solution. You basically shift and scale a certain function (e.g. a Gaussian) to fit the data. This can be done with histfit in MATLAB (using its 'kernel' distribution option).
There are more solutions to this problem, which you can research yourself now that you know the terms. Be aware, though, that your problem is not a trivial one if you want to do it properly.
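A rough way to turn the smoothing idea into a yes/no test is to smooth the histogram counts and count the significant local maxima. The sketch below is only an illustration: the synthetic counts stand in for the output of imhist, and the kernel width sigma and the relative height threshold minRel are assumed values that would need tuning on real images.

```matlab
% Synthetic stand-in for counts = imhist(I): 256 bins with two modes.
g      = 0:255;
counts = (round(400*exp(-(g - 60).^2  / (2*12^2))) + ...
          round(300*exp(-(g - 170).^2 / (2*15^2))))';

% Smooth the histogram with a normalised Gaussian kernel (sigma is a guess).
sigma  = 5;
k      = -3*sigma:3*sigma;
kernel = exp(-k.^2 / (2*sigma^2));
kernel = kernel / sum(kernel);
smoothCounts = conv(counts, kernel(:), 'same');

% Local maxima that reach a minimum fraction of the tallest peak.
minRel = 0.10;                           % assumed threshold, tune per data set
mid    = smoothCounts(2:end-1);
isPeak = mid > smoothCounts(1:end-2) & mid >= smoothCounts(3:end) ...
         & mid >= minRel * max(smoothCounts);
peakBins  = find(isPeak) + 1;            % +1 restores indexing into counts
isBimodal = numel(peakBins) == 2;
```

Counting peaks like this is a heuristic, not a statistical test; too little smoothing produces spurious peaks and too much merges real ones.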

Resampling data with minimal loss of information in time-domain

I am trying to resample/recreate already recorded data for plotting purposes. I thought this was the best place to ask the question (besides dsp.se).
The data is sampled at a high frequency, contains too many data points, and is not suitable for plotting in the time domain (not enough memory). I want to resample it with minimal loss. The sampling interval of the resulting data doesn't need to be constant (again, it is for plotting purposes, not analysis), although the input data is equally sampled.
When we use the regular resample command from MATLAB/Octave, it can distort stiff pieces of the curve.
What is the best approach here?
For reference, here are two pictures (found on tex.se):
The first image is a regular resample.
The second image is better-resampled data that behaves well around peaks.
You should try this set of files from the File Exchange. It computes an optimal lookup table based on either a maximum number of points or a given error. You can choose from natural, linear, or spline for the interpolation method. Spline will give the smallest table size but is slower than linear. I don't use natural unless I have a really good reason.
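If installing File Exchange code is not an option, a different and much simpler technique that is often used for plotting is min/max decimation: split the signal into buckets and keep only the smallest and largest sample of each, so narrow peaks survive. This is a sketch of that idea, not the lookup-table method described above; the signal, the bucket count nBuckets, and all variable names are assumptions.

```matlab
% Synthetic stand-in: a long, evenly sampled signal with one narrow spike.
n = 1e6;
t = (0:n-1) / 1e5;
y = sin(2*pi*0.5*t);
y(123456) = 5;                       % narrow peak that plain decimation could miss

% Split into buckets and keep the min and max sample of each bucket.
nBuckets = 1000;
len = floor(n / nBuckets);           % samples per bucket (any remainder is dropped)
Y   = reshape(y(1:len*nBuckets), len, nBuckets);
[~, iMin] = min(Y, [], 1);
[~, iMax] = max(Y, [], 1);
offset = (0:nBuckets-1) * len;       % start offset of each bucket
idx    = sort([iMin + offset, iMax + offset]);   % restore time order

tPlot = t(idx);
yPlot = y(idx);
plot(tPlot, yPlot)
```

The result has two points per bucket and an uneven sampling interval, which the question says is acceptable for plotting.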
Sincerely,
Jason

How to make a smooth plot in MATLAB

I have about 100 data points which mostly satisfy a certain function (but some points are off). I would like to plot all those points as a smooth curve, but the problem is that the points are not uniformly distributed. Is there any way to get a smooth curve? I am thinking of interpolating some points in between, but the only way that comes to mind is to linearly insert some artificial points between two data points. That will produce a pretty weird shape (with sharp corners). Any better ideas? Thanks.
If you know more or less what the actual curve should be, you can try to fit that curve to your points (e.g. using polyfit). Depending on how many points are off and how far, you can get by with least squares regression (which is fairly easy to get working). If you have too many outliers (or they are much too large/small), you can also try robust regression using the robustfit function, which down-weights outlying points through iteratively reweighted least squares.
If you can manually determine the outliers, you can also fit a curve through the other points to get better results or even use interpolation methods (e.g. interp1 in MATLAB) on those points to get a smoother curve.
If you know which function describes your data, robust fitting (using, e.g. ROBUSTFIT, or the new convenient functions LINEARMODEL and NONLINEARMODEL with the robust option) is a good way to go if there are outliers in your data.
If you don't know the function that describes your data, but want a smooth trendline that is little affected by outliers, SMOOTHN from the File Exchange does an excellent job in my experience.
Have you looked at the use of smoothing splines? Like interpolating splines, but with the knot points and coefficients chosen to minimise a least-squares error function. There is an excellent implementation available from Matlab central which I have used successfully.
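To make the first two suggestions concrete, here is a minimal sketch using only base MATLAB: a least-squares polynomial fit with polyfit and a shape-preserving interpolation with interp1. The data is synthetic, and the quadratic model and the 'pchip' method are assumptions chosen for illustration.

```matlab
% Synthetic stand-in: 100 unevenly spaced points near a quadratic, a few off.
rng(2);
x = sort(10 * rand(1, 100));
y = 0.5*x.^2 - 2*x + 1 + 0.3*randn(1, 100);
y([10 50 90]) = y([10 50 90]) + 8;          % three points that are "off"

xFine = linspace(min(x), max(x), 500);

% Option 1: least-squares fit of an assumed model (degree 2 here).
p    = polyfit(x, y, 2);
yFit = polyval(p, xFine);

% Option 2: shape-preserving interpolation through every point.
% It is smooth between points but still passes through the outliers.
yInterp = interp1(x, y, xFine, 'pchip');

plot(x, y, 'k.', xFine, yFit, 'r-', xFine, yInterp, 'b-')
legend('data', 'polyfit', 'pchip')
```

The fitted curve ignores the uneven spacing and is only mildly affected by a few outliers; the interpolant is only a good choice once the outliers have been removed.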

Fit A Curve to a Histogram

Is it possible to fit a curve to the histogram above in MATLAB?
The histogram is not normalized or anything like that.
I know that there is a function called histfit, but can I use it here?
Try this FileExchange submission:
ALLFITDIST - Fit all valid parametric probability distributions to data.
--- UPDATE ---
ALLFITDIST is no longer available on the MATLAB File Exchange.
You can try this instead:
FITMETHIS - finds best-fitting distribution to data vector, including non-parametric.
If you know the underlying distribution (e.g. a skewed Gaussian), you can manually compute a maximum likelihood estimate of its parameters and then plot the resulting distribution on top of your histogram. However, you need to normalize your histogram so that it shows empirical probabilities instead of counts.
I think what you want is to fit a distribution, not an arbitrary curve that might not have a finite area under it. The data looks like it's censored in the right tail, but overall it may fit a log-normal or Gamma distribution pretty well. If you have the Statistics Toolbox, try gamfit or lognfit for a start.
See also Kernel density estimation
http://en.wikipedia.org/wiki/Kernel_density
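As an illustration of the maximum likelihood route, the log-normal case can be done by hand in base MATLAB, because its maximum likelihood estimates are just the mean and standard deviation of the logged data. This is a sketch on synthetic data; the choice of a log-normal model is an assumption, and real data should be checked against it.

```matlab
% Synthetic stand-in for the raw data behind the histogram.
rng(3);
data = exp(1 + 0.5*randn(1, 5000));

% Maximum likelihood estimates of the log-normal parameters.
logData  = log(data);
muHat    = mean(logData);
sigmaHat = std(logData, 1);              % normalise by N (ML estimate)

% Log-normal density evaluated on a grid.
x      = linspace(min(data), max(data), 400);
pdfHat = exp(-(log(x) - muHat).^2 / (2*sigmaHat^2)) ./ (x * sigmaHat * sqrt(2*pi));

% Density-normalised histogram with the fitted curve on top.
histogram(data, 'Normalization', 'pdf'); hold on
plot(x, pdfHat, 'r', 'LineWidth', 1.5); hold off
```

The 'pdf' normalisation is what makes the bars and the curve comparable; with raw counts the curve would have to be scaled by the number of samples times the bin width instead.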

Analyzing data for noisy arrays

Using MATLAB, I filtered a very noisy m x n array with a low-pass Gaussian filter. That cleaned it up pretty well, but still not well enough to analyze my data. What would the next step be? I'm thinking of some kind of signal enhancement, but I am not sure how to go about it.
Update
Well, there are actually two different types of data sets. One has small peaks, circular at the base and around half a dozen pixels wide there, on a noisy background with random noise. The other is the same thing, but with mainly Gaussian and Poisson noise. I tried filtering with a Gaussian low-pass filter in both instances, and it worked to some extent, as mentioned in the original post.
It is impossible to answer this without knowing what data you have, and what the noise is like.
Different problems will have different best solutions.