Generate a probability distribution from a histogram with two peaks - matlab

I have a histogram with two peaks and I want to generate the corresponding probability distribution. I have used the following MATLAB code:
A=mydata;
M1=max(A);
M2=min(A);
I=(0:100).*(M1-M2)./100+M2;
[n,x]=hist(A,I);
bar(x,n/(1000*0.352))
I have often seen this code used to obtain the probability distribution from a histogram of normally distributed random numbers, but I don't know whether it is also valid for a histogram with two peaks, or whether it produces a normalised probability distribution.

Try using this FileExchange submission - ALLFITDIST.
I am not sure it can fit two peaks. But since the peaks are quite far from each other, you can try to fit each peak over its own range and then sum the fitted distributions.
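As a minimal sketch of both points: the histogram becomes a probability density when the counts are divided by (number of samples × bin width), regardless of how many peaks it has, and each peak can then be fitted over its own range. The data, the split threshold `split`, and the helper `normalPdf` are illustrative assumptions, as is the choice of a normal shape for each peak.

```matlab
% Placeholder bimodal data (assumption; substitute your own vector A)
rng(1);
A = [randn(600,1); 6 + 0.5*randn(400,1)];

% Histogram on 101 equally spaced bin centres, as in the question
I = linspace(min(A), max(A), 101);
[n, x] = hist(A, I);

% Normalise: divide by (total count * bin width) so the bar areas sum to 1
binWidth = x(2) - x(1);
p = n / (sum(n) * binWidth);
bar(x, p)

% Fit each peak over its own range and sum the weighted fits.
% 'split' is a hypothetical threshold between the two peaks.
split = 3;
left  = A(A <  split);
right = A(A >= split);
w = numel(left) / numel(A);          % weight of the left peak
normalPdf = @(t,mu,s) exp(-(t-mu).^2 ./ (2*s^2)) ./ (s*sqrt(2*pi));
mix = w*normalPdf(x, mean(left), std(left)) + ...
      (1-w)*normalPdf(x, mean(right), std(right));
hold on; plot(x, mix, 'r', 'LineWidth', 1.5); hold off
```

If the hard-coded divisor `1000*0.352` in the question corresponds to 1000 samples and a bin width of 0.352, it is the same normalisation written with fixed numbers.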

Related

Is there a way to get the probability from the probability density in multivariate kernel estimation?

I have a question about multivariate kernel density in matlab, which is my first time using it.
I have 3-dimensional sample data (x, y, z axes) and want to find the probability of being in a certain volume using kernel density estimation. So, I used the mvksdensity function in matlab and got the probability density (estimated function values) at the points I chose.
What I originally wanted to do was to take the triple integral of the multivariate function over a given volume (if I could find the function). But the mvksdensity function only returns the density estimates and does not return the function. I thought there would be an easy way to compute the probability from the density, but I’m stuck. Does anyone have any useful information for this? Thanks in advance.
I thought about using the fitdist function to find the distribution, but it only works for univariate kernel distributions.
I also tried to use mvncdf, which is a function that returns the cdf of the multivariate normal distribution for the row of the sample data after setting the mean and the std. But then I have to calculate the probability for a given volume for every normal distribution in each data point and then add it, which will be inefficient for a large amount of data and I don't know if it's a correct way.
I can suggest the following Monte-Carlo approach. You find a master volume that contains the entire mass of the estimated probability density. This should be as small as possible for the sake of efficiency. Then you generate a large number of test points in the master volume, either on a grid or randomly according to a uniform distribution. The probability content of a specific volume V can be estimated by the sum of the density values of the test points in V over the sum of the density values of all test points. I am afraid, however, that in 3D you would need at least 1E6 test points, probably more. If you give me access to your sample, I would be pleased to try out my suggestion. It should also be fairly easy to work out an estimate of the standard error of the estimated probability content of V.
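A minimal sketch of this Monte Carlo estimate, assuming the Statistics and Machine Learning Toolbox is available for `mvksdensity`. The sample `X`, the target box `[vLo, vHi]`, the padding of the master volume, and the number of test points are illustrative assumptions; the bandwidth uses Silverman's rule of thumb.

```matlab
% Placeholder 3-D sample (assumption; substitute your own n-by-3 data)
rng(0);
X = randn(500, 3);
[n, d] = size(X);

% Silverman's rule-of-thumb bandwidth, one value per dimension
bw = std(X) * (4 / ((d + 2) * n))^(1 / (d + 4));

% Master volume: bounding box of the data padded by three bandwidths
lo = min(X) - 3*bw;
hi = max(X) + 3*bw;

% Uniform random test points in the master volume
N = 1e5;
T = lo + rand(N, d) .* (hi - lo);

% Kernel density estimate at every test point
f = mvksdensity(X, T, 'Bandwidth', bw);

% Target volume V: here the box [-1,1]^3 (hypothetical choice)
vLo = [-1 -1 -1];
vHi = [ 1  1  1];
inV = all(T >= vLo & T <= vHi, 2);

% Probability content of V: density mass inside V over total density mass
P = sum(f(inV)) / sum(f);
```

Because the test points are uniform, the ratio of sums approximates the ratio of the two integrals. Increasing `N` reduces the Monte Carlo error at the cost of more density evaluations.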

Fit MRI data to a noncentral chi distribution

I'm working on Magnetic Resonance Imaging data, on Matlab R2020a. In particular, I have to characterize the background noise of the image, and I know it has a noncentral chi distribution. Now I'm trying the mle method, without results:
[phat,pci] = mle(data,'pdf',@(data,v,d)ncx2pdf(data,v,d),'start',[1 1]);
data is a 1024-by-1 vector, v is the number of degrees of freedom of the distribution, and d is the noncentrality parameter (the two parameters that I have to find).
The problem lies in the fact that the distribution strongly depends on the value of the mean, and the order of magnitude of my data is 10^-6.
histogram of the data:
Does anyone know a method to fit the data to a noncentral chi distribution? I already tried the 0-1 and 0-255 normalizations, but they produce unreliable mean values. Any suggestions are welcome.
data.mat

How to generate a 2D random vector in MATLAB?

I have a non-negative function f defined on a unit square S = [0,1] x [0,1] such that the integral of f over S equals 1.
My question is, how can I use MATLAB to generate a 2D random vector from S according to the probability density function f?
Rejection Sampling
The suggestion Luis Mendo made is very good because it applies to nearly all distribution functions. Based on this answer I wrote code for m.
An important point when using rejection sampling this way is that you must know the maximum of your pdf within the range. If you overestimate the maximum, your code will only run slower. If you underestimate it, the samples will not follow the intended distribution!
The idea is that you sample many uniformly distributed points and accept each one with a probability proportional to the probability density at that point.
pdf=@(x) 0.5.*x(:,1)+3./2.*x(:,2);
maximum=2; %Right maximum for THIS EXAMPLE.
%If you are unable to determine the maximum of your
%function within the [0,1]x[0,1] range, please give an example.
result=[];
n=10;
while (size(result,1)<n)
    %1. sample random point:
    val=rand(1,2);
    %2. Accept with probability pdf(val)/maximum
    if rand<pdf(val)/maximum
        %append to solution
        result(end+1,:)=val;
    end
end
I know that this solution is not a fast implementation, but I wanted to start with an implementation as simple as possible to make sure that the concept of rejection sampling becomes clear.
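For comparison, here is a sketch of the same rejection sampler in vectorised form, drawing candidates in batches instead of one at a time. It reuses the example density and its maximum from above; the handle name `f`, the batch size, and `n` are illustrative assumptions.

```matlab
% Example density on [0,1]x[0,1] and its maximum (attained at (1,1))
f = @(x) 0.5.*x(:,1) + 1.5.*x(:,2);
maximum = 2;

rng(3);
n = 1000;                 % number of samples wanted
batch = 2*n;              % candidates per pass (acceptance rate here is 1/2)
result = zeros(0, 2);

while size(result, 1) < n
    cand = rand(batch, 2);                        % uniform candidates in S
    keep = rand(batch, 1) < f(cand) / maximum;    % accept with prob f/maximum
    result = [result; cand(keep, :)];             %#ok<AGROW>
end
result = result(1:n, :);  % trim surplus accepted points
```

The acceptance rate equals the integral of the density divided by (maximum × area of S), which is 1/2 for this example, so a batch of `2*n` candidates usually finishes in one or two passes.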
ICDF
Besides rejection sampling there is a different approach to address this issue on a more mathematical level, but you need to sit down and do some math first to end up with a better solution. For 1-dimensional distributions you typically sample using the ICDF (inverse cumulative distribution function), simply using ICDF(rand(n,1)) to get random samples.
If you manage to do the math, you could instead define two functions for your PDF in matlab: ICDF1 (ICDF for the first dimension) and ICDF2 (ICDF for the second dimension).
The first, ICDF1, would map uniformly distributed random samples to sample values for the first dimension of your random distribution.
The second, ICDF2, would map the output of ICDF1 and uniformly distributed samples to your intended solution.
Here is some matlab code assuming you already defined ICDF1 and ICDF2:
samples=ICDF1(rand(n,1));
samples(:,2)=ICDF2(samples,rand(n,1));
The great advantage of this solution is that it does not reject any samples, so it is potentially much faster.
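As a worked sketch for the example density f(x,y) = 0.5x + 1.5y used in the rejection sampling code, both inverse functions can be derived by hand. The marginal density of x is 0.5x + 0.75, so its CDF is 0.25x^2 + 0.75x; the conditional CDF of y given x is (0.5xy + 0.75y^2)/(0.5x + 0.75). Solving each quadratic for its non-negative root gives the two handles below. These closed forms are specific to this example and are not part of the original answer.

```matlab
% Inverse of the marginal CDF of x: solve 0.25*x^2 + 0.75*x = u for x
ICDF1 = @(u) (-3 + sqrt(9 + 16*u)) / 2;

% Inverse of the conditional CDF of y given x:
% solve (0.5*x*y + 0.75*y^2) / (0.5*x + 0.75) = v for y
ICDF2 = @(x, v) (-x + sqrt(x.^2 + 3*v.*(2*x + 3))) / 3;

rng(4);
n = 1e5;
samples = ICDF1(rand(n, 1));                  % first coordinate
samples(:, 2) = ICDF2(samples, rand(n, 1));   % second coordinate given the first
```

A quick sanity check is to compare the sample means with the exact values for this density, E[x] = 13/24 and E[y] = 5/8.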

Match template histogram with testing histogram

How can we calculate the percentage of similarity between two histogram patterns?
For example, I have a template histogram, which I call HistA, and another histogram, HistB, and I want to check how similar HistB is to HistA as a percentage.
I tried some methods such as histogram equalization and histogram matching, but none of them work for my problem.
As shown in the image below, I created a combined histogram of HistA and HistB. The frequency values actually come from 1D data.
The patterns of HistA and HistB look almost the same, so I want to know how to calculate the percentage of similarity between these two histograms.
Measure the Bhattacharyya coefficient between the two normalized histograms HistA and HistB as
BC = sum over i = 1..N of sqrt(HistA(i) * HistB(i))
where N is the number of bins in the histograms.
Note the normalization: each histogram must sum to 1.
For more information, see the Wikipedia article on the Bhattacharyya distance, or the paper "On a measure of divergence between two statistical populations defined by their probability distributions".
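A minimal sketch of this computation, assuming both histograms were built on the same bins. The counts in `histA` and `histB` are made-up placeholders, and reporting the coefficient as a percentage is one possible reading of "percentage of similarity", not a standard definition.

```matlab
% Placeholder bin counts on identical bins (assumption)
histA = [5 12 30 25 8];
histB = [4 14 28 26 8];

% Normalise each histogram so that its bins sum to 1
pA = histA / sum(histA);
pB = histB / sum(histB);

% Bhattacharyya coefficient: 1 for identical shapes, 0 for no overlap
bc = sum(sqrt(pA .* pB));

% Expressed as a percentage
similarityPercent = 100 * bc;
```

If the two histograms use different bin edges, they need to be recomputed on a common set of edges first, otherwise the bin-wise product is not meaningful.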

How to plot a probability density distribution graph in MATLAB?

I have about 10000 floating-point values, and have read them into a single-row matrix.
Now I would like to plot them and show their distribution. Is there a simple function to do that?
plot() actually plots the value against the data index, which is not what I want.
bar() is similar to what I want, but I would like to lower the sample rate and merge neighboring bars that are close enough (e.g. one bar for 0.50-0.55, one bar for 0.55-0.60, etc.) instead of having a separate bar for every data sample.
Is there a function that calculates this distribution by dividing the range into small steps and outputting the probability density in each step?
Thank you!
hist() would be best. It plots a histogram, with a lot of options which you can see with doc hist, or by checking the Matlab website. Options include specifying the number of bins or the bin locations. The following plots a histogram of 1000 normally distributed random points with 50 bins.
hist(randn(1000,1),50)
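Note that hist plots raw counts rather than a probability density. As a sketch of the density the question asks for, `histcounts` (available in newer MATLAB releases) can bin the data with a fixed bin width and return the density per bin directly. The data and the 0.05 bin width are placeholders taken from the example ranges in the question.

```matlab
% Placeholder data in a single row (assumption; substitute your own values)
rng(2);
data = rand(1, 10000);

% Bins of width 0.05; 'pdf' normalisation gives count / (total * bin width)
binWidth = 0.05;
[p, edges] = histcounts(data, 'BinWidth', binWidth, 'Normalization', 'pdf');

% Plot the density at the bin centres with touching bars
centers = edges(1:end-1) + diff(edges)/2;
bar(centers, p, 1)
xlabel('value'); ylabel('probability density')
```

With this normalisation the bar areas sum to 1, so `p` can be read as an estimate of the probability density in each step.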