What is the official Matlab way to plot the values of histcounts into a histogram with any normalization option? - matlab

Assume that I have an array of counts (ideally returned by histcounts). Is there an official Matlab way to plot such a histogram with all the standard normalization options available?
It seems that the best suggestion I have is to get the counts from histcounts and then plot them with bar. Something like:
edges = linspace(0, bound, nbins+1);   % nbins bins need nbins+1 edges
hist_c = histcounts(X, edges);
bar(edges(1:end-1), hist_c);            % plot the counts at the left bin edges
Unfortunately, as far as I know, using bar is not really recommended according to this link, probably because, as is obvious from the code, it moves a lot of implementation details into user code (such as producing the edges array manually when only nbins is needed, or having to know whether to pass edges(1:end-1) or edges(2:end) to bar).
Furthermore, and I believe this is the worst part, it leaves the user to implement the normalization options on their own. One might point out that histcounts can do the normalization for you; however, it can only do so given the data matrix X. If X were extremely large, one would be in trouble, because the histogram counts of X can be accumulated on the fly (as done in this question), but the other normalization options cannot easily be applied on the fly. In principle the user could implement each normalization option as described by the equations in the documentation, but it seems extremely inefficient to have users implement this by hand. Is there a way to get access to the code that actually performs this normalization?
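For reference, the documented normalization options reduce to simple formulas on the raw counts. A minimal sketch, assuming counts and edges came from histcounts (or from an on-the-fly accumulation):
N = sum(counts);                               % total number of binned observations
widths = diff(edges);                          % bin widths
centers = (edges(1:end-1) + edges(2:end))/2;   % bin centers for plotting
prob         = counts ./ N;                    % 'probability'
countdensity = counts ./ widths;               % 'countdensity'
pdf_vals     = counts ./ (N .* widths);        % 'pdf'
cdf_vals     = cumsum(counts) ./ N;            % 'cdf' ('cumcount' is just cumsum(counts))
bar(centers, pdf_vals, 1);                     % e.g. plot the 'pdf'-normalized histogram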
In reality, what my question comes down to is: is there an official MATLAB way to produce a histogram given only the histogram counts? In particular, one that hides all the implementation details of producing the counts, normalization, binning, edges, etc.?
The ideal code in my mind should look like this to the user:
histogram_counts = get_hist_count(X)
plot_histogram(histogram_counts,'Normalization',normalization)
and produces the desired histogram plot.
Related questions:
https://www.mathworks.com/matlabcentral/answers/332178-how-does-one-plot-a-histogram-from-the-histogram-counts
https://www.mathworks.com/matlabcentral/answers/275278-what-is-the-recommended-practice-for-plotting-the-outputs-of-histcounts
https://www.mathworks.com/matlabcentral/answers/91944-how-can-i-combine-the-options-histc-and-stack-in-a-bar-plot-in-matlab-7-4-r2007a#answer_101295

Related

Is there an effective way to fit the following two datasets with lsqcurvefit?

I have two complex datasets for which I intend to find a suitable fitting function. The first dataset is shown below:
As you can see, although complicated, this dataset appears to be a combination of rectangle functions. These data describe how the 'Amplitude' of complex numbers varies with time. The second picture looks like this:
This relation describes the 'Phase' of the same complex numbers over time, and it also appears to be a combination of rectangle functions. At first I wanted to use combinations of Fourier cosine and sine series to fit the amplitude and phase using
lsqcurvefit
in MATLAB, but the fitted parameters fail to converge to the correct values (I have tried a number of options, such as adjusting FiniteDifferenceStepSize, FiniteDifferenceType, StepTolerance and so on). Despite many failures, I saw someone say that a normal cumulative distribution function (CDF) can be used to fit a step function, and I thought it might work to use combinations of a parameterized CDF and
y = erfc(x)
to achieve a successful fit. So, could anyone suggest solutions or ways to fit the above two relations? Even general ideas would be very helpful to me.
PS: For now I don't care about any hidden physics inside these data; all I want is to find a mathematical way to fit the above two relations in MATLAB.
Thanks!
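For concreteness, here is a minimal sketch of the erf-based step idea mentioned in the question, with two transitions, hypothetical data, and hand-picked start values (lsqcurvefit requires the Optimization Toolbox):
tdata = linspace(0, 10, 500);
ydata = 1 + 2*(tdata > 3) - 1.5*(tdata > 7) + 0.05*randn(size(tdata));  % hypothetical signal
% p = [a1 t1 w1  a2 t2 w2  offset]: amplitude, location and width of each smoothed step
model = @(p, t) p(7) + p(1)*0.5.*(1 + erf((t - p(2))./p(3))) ...
                     + p(4)*0.5.*(1 + erf((t - p(5))./p(6)));
p0   = [2 3 0.1  -1.5 7 0.1  1];                                        % rough initial guess
pfit = lsqcurvefit(model, p0, tdata, ydata);
plot(tdata, ydata, '.', tdata, model(pfit, tdata), '-');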

plotting histogram in matlab with highly unequal distribution

So I am plotting two years' worth of data to see the change in the distribution of values, but in one of the histograms one of the years is heavily dominated by the first column, because there are many zeros in that data set. Would you recommend creating a bar strictly for zeros? How would I create this index in MATLAB? Or how can I better manipulate the histogram to reflect the actual data set and make it clear that the zeros account for the sharp initial rise?
Thanks.
This is more of a statistical question. If you have good reason to ignore the zeros, for example because one of your data acquisition systems produced them due to a malfunction, you can simply get rid of them with
hist(data(data~=0))
But you would not even need to look at the histograms; you could use the variance or even the standard deviation to see how much your data shifted.
Furthermore, to compare data populations, boxplots are much better and easier to handle.
doc boxplot
If, on the other hand, your zeros are genuine to your data, you have to keep them! I am sorry, but here too the boxplot function might help you, because the zeros might show up as outliers (little red crosses) or the box will simply start at the zero line.
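A minimal sketch of that comparison, with hypothetical year1/year2 data (boxplot is part of the Statistics and Machine Learning Toolbox); it also reports the zero fraction explicitly, which makes clear how much of the first bin is due to exact zeros:
year1 = [zeros(1, 200), exprnd(5, 1, 300)];            % hypothetical data with many zeros
year2 = exprnd(5, 1, 500);                             % hypothetical data without them
g1 = repmat({'year 1'}, numel(year1), 1);
g2 = repmat({'year 2'}, numel(year2), 1);
boxplot([year1(:); year2(:)], [g1; g2]);               % compare the two populations
fprintf('Year 1: %.1f%% zeros\n', 100*mean(year1 == 0));
fprintf('Year 2: %.1f%% zeros\n', 100*mean(year2 == 0));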

Matlab Image Histogram Analysis: how do I test for an underlying bimodal distribution?

I am working with image processing in MATLAB. I have two different images whose histogram plots are as shown below.
Image 1:
and
Image 2:
I have multiple images like those, and the only distinguishing (separating) feature is that some have a single peak and others have two peaks.
In other words, some can be thresholded (to generate good results) while others cannot. Is there any way I can separate the two kinds of images? Are there any functions in MATLAB, or any reference code, that will help?
The function used is imhist()
If by "separate" you mean "distinguish", then yes: the property you describe is called bimodality, i.e. you have 2 peaks that can be separated by one threshold. So your question is actually "how do I test for an underlying bimodal distribution?"
One option to do this programmatically is Binning. This is not the most robust method but the easiest. It might work, it might not.
Kernel smoothing is probably the more robust solution. You basically shift and scale a certain function (e.g. a Gaussian) to fit the data. This can be done with histfit in MATLAB.
There are more solutions to this problem, which you can research for yourself since you now know the terms needed. Be aware, though, that your problem is not a trivial one if you want to do it properly.
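As a hedged illustration of the kernel-smoothing idea, here is a minimal sketch that smooths the pixel-value distribution with ksdensity and counts the prominent peaks (ksdensity needs the Statistics and Machine Learning Toolbox, findpeaks the Signal Processing Toolbox, and the image name is only an example):
img  = imread('cameraman.tif');                         % example image shipped with MATLAB
f    = ksdensity(double(img(:)));                       % kernel-smoothed density of pixel values
pk   = findpeaks(f, 'MinPeakProminence', 0.1*max(f));   % only count clearly separated modes
if numel(pk) >= 2
    disp('Histogram looks bimodal: a single threshold should separate it.');
else
    disp('Histogram looks unimodal: thresholding is unlikely to work well.');
end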

Curve fitting, but I want to guarantee only one inflection point

I often find myself fitting a scatter plot, and knowing that the 'true fit' should have only one inflection point. Any ideas for forcing a fit that will obey this?
I am using Matlab and Microsoft Excel
Many thanks
Option 1:
I like to use spline smoothing with the Akaike information criterion; while it is a hyper-parametric fit with a large number of analytic candidate inflection points, the smoothed data at the sample points tends to reveal only what is within the data.
If your data doesn't actually have an inflection point, this is indicated. If it does, it is usually captured as well. In statistical jargon, an important cousin of this idea is the "non-informative prior".
Try slides 30-31 here: link.
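For reference, a minimal sketch of the spline-smoothing idea (Curve Fitting Toolbox); the AIC-driven choice of the smoothing parameter described above is not shown, the parameter is simply picked by hand and the data are hypothetical:
x = linspace(0, 10, 60)';
y = 1 ./ (1 + exp(-(x - 5))) + 0.05*randn(size(x));          % hypothetical noisy scatter
mdl = fit(x, y, 'smoothingspline', 'SmoothingParam', 0.05);  % smoothing spline fit
plot(mdl, x, y);                                             % smoothed curve over the samples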
Option 2:
If you have an older version of MATLAB, you can specify the exact model easily in cftool (not the same as sftool) and then generate an M-file that shows how to put it into your own script. Pick a model appropriate to your data.
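As one concrete, hedged way to guarantee a single inflection point, you can restrict yourself to a model family that can only have one, e.g. a four-parameter logistic, and fit it with the Curve Fitting Toolbox (the data below are hypothetical):
x = linspace(0, 10, 50)';
y = 2 ./ (1 + exp(-1.5*(x - 4))) + 0.1*randn(size(x));        % hypothetical scatter
ft  = fittype('a + (b - a) ./ (1 + exp(-c*(x - d)))', ...     % logistic: single inflection at x = d
              'independent', 'x', 'coefficients', {'a', 'b', 'c', 'd'});
mdl = fit(x, y, ft, 'StartPoint', [min(y), max(y), 1, median(x)]);
plot(mdl, x, y);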

Functional form of 2D interpolation in Matlab

I need to construct an interpolating function from a 2D array of data. The reason I need something that returns an actual function is that I need to be able to evaluate it as part of an expression that I need to numerically integrate.
For that reason, "interp2" doesn't cut it: it does not return a function.
I could use "TriScatteredInterp", but that's heavy-weight: my grid is equally spaced (and big), so I don't need the Delaunay triangulation.
Are there any alternatives?
(Apologies for the 'late' answer, but I have some suggestions that might help others if the existing answer doesn't help them)
It's not clear from your question how accurate the resulting function needs to be (or how big 'big' is), but one approach that you could adopt is to regress the data points that you have using a least-squares or Kalman filter-based method. You'd need to do this with a number of candidate function forms and then choose the one that is 'best', for example by using a measure such as MAE or MSE.
Of course this requires some idea of what the form of the underlying function could be, but your question isn't clear as to whether you have this kind of information.
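A minimal sketch of the least-squares idea for gridded data (the grid and the candidate quadratic form below are hypothetical; in practice you would compare several candidate forms by their MSE):
[X, Y] = meshgrid(linspace(0, 1, 50));
Z = sin(pi*X) .* cos(pi*Y);                                 % hypothetical gridded samples
A = [ones(numel(X),1), X(:), Y(:), X(:).^2, X(:).*Y(:), Y(:).^2];
c = A \ Z(:);                                               % least-squares coefficients
f = @(x, y) c(1) + c(2)*x + c(3)*y + c(4)*x.^2 + c(5)*x.*y + c(6)*y.^2;
mse = mean((A*c - Z(:)).^2);                                % goodness of this candidate form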
Another approach that could work (and requires no knowledge of what the underlying function might be) is the use of the fuzzy transform (F-transform) to generate line segments that provide local approximations to the surface.
The method for this would be:
Define a 2D universe that includes the x and y domains of your input data
Create a 2D fuzzy partition of this universe, choosing partition sizes that give the accuracy you require
Apply the discrete F-transform using your input data to generate fuzzy data points in a 3D fuzzy space
Pass the inverse F-transform as a function handle (along with the fuzzy data points) to your integration function
If you're not familiar with the F-transform then I posted a blog a while ago about how the F-transform can be used as a universal approximator in a 1D case: http://iainism-blogism.blogspot.co.uk/2012/01/fuzzy-wuzzy-was.html
To see the mathematics behind the method and extend it to a multidimensional case, the University of Ostrava has published a PhD thesis that explains its application to various engineering problems and also provides an example of how it is constructed for the case of a 2D universe: http://irafm.osu.cz/f/PhD_theses/Stepnicka.pdf
If you want a function handle, why not define f = @(xi,yi) interp2(X,Y,Z,xi,yi)?
It might be a little slow, but I think it should work.
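For example, a minimal sketch of this approach on a hypothetical grid, passing the handle to integral2 (which expects a function that accepts arrays, as interp2 does):
[X, Y] = meshgrid(linspace(0, 1, 50));
Z = sin(pi*X) .* cos(pi*Y);                          % hypothetical gridded samples
f = @(xi, yi) interp2(X, Y, Z, xi, yi, 'spline');    % interpolant wrapped as a function handle
I = integral2(f, 0, 1, 0, 1);                        % numerically integrate the interpolant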
If I understand you correctly, you want to perform a surface/line integral of 2-D data. There are ways to do it, but maybe not the way you want. I had the exact same problem and it's annoying! The only way I solved it was by using the Surface Fitting Tool (sftool) to create a surface and then integrating it.
After you create your fit using the tool (it has a GUI as well), it will generate an sfit object, which you can then integrate (in 2-D) using quad2d.
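For illustration, a hedged sketch of that route done programmatically: fit creates the same kind of sfit object that sftool produces (the data and the 'poly33' model choice below are hypothetical and need the Curve Fitting Toolbox):
[X, Y] = meshgrid(linspace(0, 1, 30));
Z  = sin(pi*X) .* cos(pi*Y);                          % hypothetical samples
sf = fit([X(:), Y(:)], Z(:), 'poly33');               % surface fit, returns an sfit object
I  = quad2d(@(x, y) sf(x, y), 0, 1, 0, 1);            % integrate the fitted surface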
I also tried your method of using interp2 and got similar results to the sfit object, but I had no idea how to do a numerical integration (line/surface) with the data. Creating the sfit object and then integrating it was much faster.
It was the first time I did something like this, so I confirmed it with a numerically evaluated line integral. According to Stokes' theorem, the surface integral and the line integral should be the same, and they did turn out to be the same.
I asked this question on the mathematics Stack Exchange: I wanted to do a line integral of 2-D data, ended up doing a surface integral, and then confirmed the answer using a line integral!